So thank you for coming to this session. I'm Takashi Kajinami from NTT DATA in Japan, and I'm a platform engineer at NTT DATA. I'm glad and excited to get the chance to introduce our activity around OpenStack Swift to you. In this project we constructed a multi-region, multi-petabyte, multi-cluster OpenStack Swift object storage. It was a very big project and there were some challenges in it, so today I'll introduce what we did in the project, and I hope my presentation will be a help to those of you who have just started working on OpenStack Swift. Okay, so let me start with the disclaimer for this presentation. First, any product names, service names, software names and other marks are trademarks or registered marks of their respective companies. Second, the purpose of this presentation is to provide the knowledge gained from our first and biggest Swift project. I'll particularly focus on our challenges and our solutions for them: what we faced, what we considered and what we did in that project. Finally, the presenter and NTT DATA Corporation provide this information on an as-is basis and have no responsibility for results you get from the information in this presentation material. Of course, situations differ in each project, so we cannot guarantee your complete success in your project, but I'll be glad if this presentation is some help for those of you who have just started or are now planning your Swift project. First of all, I should introduce myself and ourselves. I'm Takashi Kajinami, a platform engineer at NTT DATA. NTT DATA is a subsidiary of NTT, a Japanese system integration company which mainly works on the construction of IT systems. Within NTT DATA, our sector is something like an OSS professional team, and we mainly focus on constructing systems with open source software.
We are working on things like PostgreSQL — okay, now the slide has changed — so we are working on PostgreSQL, Hadoop, Hinemos and so on, and cloud and OpenStack. Our team particularly focuses on OpenStack Swift and provides cloud storage solutions built on OpenStack Swift. Here is the agenda for my presentation. First I will talk shortly about three key features of OpenStack Swift. Then I'll move my focus to our OpenStack Swift project and explain our four big challenges and our solutions for them in that project. Finally, I'll summarize our activity in the project and introduce our future vision with Swift. So, OpenStack Swift. OpenStack Swift is a part of the OpenStack project, and it is a storage project. OpenStack Swift realizes distributed object storage like Amazon S3. Object storage is a new style of storage with a different interface from conventional storage like block storage or file systems. Swift provides a RESTful API, and clients can upload data into the storage with a PUT request and download data from the storage with a GET request. This REST API works over the HTTP protocol, so Swift is often used as archiving storage for web contents like photos and videos, or as storage for backup data or archive data. Swift has many, many good features, but I don't have much time today to explain them all, so I'll talk about three key features of Swift: durability, scalability and openness. The first key feature is durability. Swift makes some copies of data in the storage cluster, for example three copies, and places these copies over devices and nodes as uniquely as possible.
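As a quick illustration of the PUT/GET interface mentioned a moment ago, here is a minimal Python sketch. The account, container and object names are made up, and a tiny in-process HTTP server stands in for a real cluster; a real Swift endpoint would also require an auth token (the X-Auth-Token header), which is omitted here.

```python
# Minimal sketch of Swift's REST style: upload with PUT, download with GET.
# A tiny in-memory HTTP server stands in for the real storage cluster.
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

_store = {}  # in-memory stand-in for the object store


class FakeSwift(BaseHTTPRequestHandler):
    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        _store[self.path] = self.rfile.read(length)
        self.send_response(201)  # Swift answers "201 Created" on PUT
        self.end_headers()

    def do_GET(self):
        body = _store.get(self.path)
        if body is None:
            self.send_response(404)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):  # keep the demo quiet
        pass


server = HTTPServer(("127.0.0.1", 0), FakeSwift)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Swift object paths follow /v1/<account>/<container>/<object>
path = "/v1/AUTH_demo/photos/cat.jpg"

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("PUT", path, body=b"image bytes")
resp = conn.getresponse()
resp.read()
put_status = resp.status  # 201
conn.close()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", path)
data = conn.getresponse().read()  # b"image bytes"
conn.close()
server.shutdown()
```

In a real deployment the same two verbs go to the proxy server's public endpoint rather than a local test server.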
So even if some part of the Swift cluster fails, you can protect data from the failure and continue to access all of the data with the remaining copies. In addition, Swift also automatically detects disk failures and heals missing data copies without manual intervention. Recently, from the Grizzly and Havana releases, Swift got the global cluster feature. This feature enables a geographically distributed storage cluster over multiple data centers to realize disaster recovery, so now you can overcome even that kind of failure. The second key feature is scalability. Swift distributes data over multiple devices, and when you add new devices, it rebalances data onto the new devices. So when you enlarge capacity, you can enlarge your Swift cluster's capacity and improve the performance of the Swift cluster by adding new nodes and new devices to your cluster. We can extend the storage from a small capacity like 10 terabytes to a huge capacity like dozens of petabytes. In addition, there is no limitation on the number of devices you can add, so you can extend the Swift cluster flexibly. You can add capacity as much as you need, when you need it, and adapt your storage to unpredictable market situations with effective cost. The last key feature is openness. Swift is open source software, and it is written in Python.
So you can run it on commodity IA servers and Linux. You don't need any special devices for Swift, and you can select cost-effective hardware to construct a huge storage cluster. In addition, you can flexibly mix several types of servers: as described in this figure, you can mix several types of hardware and construct one huge Swift cluster. You can add the latest servers when you extend, and on the other hand you can remove old servers when they get broken after their maintenance period. So you can keep your Swift cluster running for a long time, regardless of the maintenance period of the servers, by replacing old servers with new ones. Okay, so now I've explained the three key features of Swift, and let me move to our project. Our project was a migration project from existing high-end storage to distributed storage. This storage serves as back-end storage for an application in a communication platform company and stores more than a petabyte of data from more than 10 million end users. This project required high durability including disaster recovery, scalability, and cost optimization for storage, and our customer selected Swift. This project was a very exciting, very cool project, but as you know, no project finishes without any challenges. In this project there were four big challenges, which I show in this slide, and I will explain these challenges in more detail along with our solutions for them. The first challenge is the assurance of data durability in Swift. Japanese customers are often very sensitive about the quality of the system, and our customer was an extremely, super quality-crazy customer. So in this system — in their culture — system design is a very important thing.
So in this system are the culture system design is a very important thing And everything should be under control in their system and you should design behavior of the system Not only in normal situation, but also in failure situation. So In swift it's not so easy thing to design all of the behavior Because swift is distributed system and many components on the many servers Co-work to build the whole function But we have to analyze every behavior before building system. That was a big for problem We faced first to solve this problem. So we considered we tried To overcome that problem and we decided to do the recovery test So we made hundreds of test cases based on three aspects So described in this slide the first one is the point of failure So we change the point of failure like disks like a nose process And then the second point is the number of failure. So the swift has a copy Swift copies data Over devices so the number of failure is one point which changes the behavior of swift. So we change the number of failure like one two three four and we tested these varieties of numbers and Then the last point is a range of failure So there are some field domain in swift and it distributes data about the isk noble knows of a wax about a Geographical location so the range of failure may change the behavior of swift so we tested it In addition to we are testing we checked source code of sweat with comparing the result of the testing and extract Collogical of a sweat which defines baby over swift So as a result of this recovery test We completely analyzed the behavior of swift and ensure its extreme durability availability and the recoverability switch can keep data and Continuous to work well, even if there are no sniper who akiva to a brexit three hard disk drives Storing the same data from some thousands of hard disk drives or a no big disaster which suffers all over you that seat those Okay, so now the first challenge is solved and then we face the second challenge So 
Our second challenge was global distribution. Disaster recovery was required in this project — there are many, many customers interested in disaster recovery in Japan, and our customer was one of them — and we decided to distribute the functions and data of Swift over three sites, as figured in this slide. We distributed the proxy cluster over two sites, and also distributed the storage cluster over two sites. Each data center is more than 300 kilometers away from the others, so you can keep access to the data even if one of the sites goes down because of an unexpected disaster. Now the placement was decided, but we had to check whether Swift works well in such a distributed construction. We had two points to evaluate to realize this global distribution. The first one is client requests. When a client sends a request to Swift, the proxy server talks to the storage servers and all these processes transfer data, so in a global construction, latency between proxy and storage may affect this flow. The second one is durability. In Swift, each storage node talks to the others to ensure all copies are stored, and in a globally distributed cluster they have to talk over a network with latency, so we had to evaluate the effect of that latency. To test a globally distributed Swift cluster like the one figured in the previous slide, we constructed a pseudo global cluster with simulated network latency, as shown in this slide, and tested this cluster. We assumed the proxy servers and the storage servers were placed in different locations and simulated network latency among them. We changed the simulated latency from 10 milliseconds to 200 milliseconds and checked how the behavior of Swift changes. tc is used to simulate the network latency.
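A minimal sketch of how the tc/netem commands for this kind of simulation could be generated — half of the desired round-trip latency is applied on each side, as explained next; the interface name `eth0` is an assumption:

```python
# Sketch: build tc/netem commands that add a one-way delay so that the
# round-trip latency between two hosts equals the desired value.
def netem_delay_command(rtt_ms, device="eth0"):
    """Return the tc command applying half the RTT as one-way delay."""
    one_way_ms = rtt_ms / 2
    return f"tc qdisc add dev {device} root netem delay {one_way_ms:g}ms"

def netem_clear_command(device="eth0"):
    """Return the tc command removing the simulated delay again."""
    return f"tc qdisc del dev {device} root netem"

cmd_10ms = netem_delay_command(10)    # 5 ms each way -> 10 ms round trip
cmd_200ms = netem_delay_command(200)  # 100 ms each way -> 200 ms round trip
```

Running the generated commands requires root privileges on the test nodes; here they are only constructed, not executed.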
For example, when you want to set 10 milliseconds of latency, you set 5 milliseconds of latency for each direction; then for the round trip, 10 milliseconds of latency is set. With this pseudo global cluster we tested two things. First, to ensure that Swift can serve client requests properly, we tested object PUT, GET and DELETE on this pseudo global cluster. We checked its health by the error rate of requests, and its performance by the turnaround time of one request and the throughput, and evaluated the effect of latency between proxy and storage. Second, to ensure the durability of the global construction, we tested the auto-recovery feature of the object replicator, which recovers object data lost because of disk failures. We checked its health by the error rate of the object replicator, and its performance by the turnaround time of one replication process and the throughput, and evaluated the effect of latency between storage nodes. Okay, so here I show the results from our pseudo global cluster — first, the results of testing client requests. There was no error caused by latency, so Swift works well on a network with latency. As you can imagine, the response time degrades as latency grows, but you can realize effective throughput with concurrency: if you use concurrent requests, you can get effective throughput. So we made the application on top of Swift send concurrent requests to Swift, and got effective throughput in that project. We also made a formula for turnaround time as a function of latency, estimated how long it takes to serve one request, and used it to decide a proper timeout parameter in that application.
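The effect just described — per-request time grows with latency while concurrency preserves aggregate throughput — can be reproduced with a small simulation, where a sleep stands in for one request over a high-latency link:

```python
# Simulation: serial requests pay the full latency per request, while
# concurrent requests overlap their waiting time, keeping throughput up.
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.05   # 50 ms simulated round trip per request
N_REQUESTS = 10

def fake_request(i):
    time.sleep(LATENCY)  # waiting on the network, not the CPU
    return i

start = time.monotonic()
for i in range(N_REQUESTS):
    fake_request(i)
serial_time = time.monotonic() - start      # ~ N_REQUESTS * LATENCY

start = time.monotonic()
with ThreadPoolExecutor(max_workers=N_REQUESTS) as pool:
    results = list(pool.map(fake_request, range(N_REQUESTS)))
concurrent_time = time.monotonic() - start  # ~ LATENCY (requests overlap)
```

The same reasoning applied to the object replicator: raising its concurrency recovered the auto-recovery throughput lost to inter-site latency.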
Okay, then I will show you the second result, the result of the object replicator testing. We can see the result is very similar to the one for client requests: there was no error caused by latency, and the performance of one replication process degrades as latency grows, but you can realize effective throughput with concurrent processing. So we increased the concurrency of the object replicator to get effective throughput of auto recovery, in order to ensure durability. Okay, then the third challenge: the quality and delivery problem. In this project we had to build hundreds of Swift nodes in two weeks. That's not a long time for building such a huge number of nodes. In addition, there were many customer rules about operation; for example, you should type every command by hand, you should check that each command finished successfully at each step, you should check files before you edit them, and so on. We noticed that we could not keep the delivery schedule with manual building under such rules. To solve this delivery problem, we constructed an automated building tool for Swift nodes. With this tool we can automatically install software on servers and configure all of it. We used Kickstart for installation and Puppet for configuration, installed about 10 pieces of software including the OS, Swift, monitoring software and so on, and set about 50 configurations for OS functions and the other software. Okay, so with this automated building tool we realized a speed-up of our building, and now we can build more than 200 nodes in one day.
This speed is about 100 times as fast as the manual building we used to do. In addition to the speed-up, we also improved the quality of building by eliminating human error from the building process. Humans sometimes make mistakes, like typos in commands or skipped lines. With this tool everything is automated, and all you have to do is start your building tool, so you can ensure the quality before you build the actual environment by testing the building tool in your test environment. Okay, so now we could build all of the Swift nodes within the time limit, and everything seemed to work well. But unfortunately our customer asked for more. Our quality-crazy customer never gets satisfied with us saying "now everything seems to work well". They also never accept us saying "everything works well" — they demand proof. So we had to prove that everything works properly in Swift. There are two aspects to this proof. The first one is the API: we have to prove that the Swift cluster is working properly as a whole cluster. The second one is the nodes: we have to prove that all of the Swift nodes work properly. We had to check these aspects in the building process, and as you can imagine, there were so many things to check, but unfortunately we had no more time — we had to build and test the Swift cluster in two weeks. It was a very big problem in this project. So how did we do it? Our solution was the automation of the testing. In this project, to solve this big quality and delivery problem,
we also constructed an automated testing tool in addition to the automated building tool. With this testing tool we can test the responses from the Swift cluster and all responses from the Swift nodes, including not only normal responses but also error responses, like client errors or server errors. To realize this testing we extended Tempest to test error cases, and made another checker to check the behavior of all of the Swift nodes in the cluster. Okay, so now with the testing tool we can prove that Swift works properly and ensure the quality of building. We realized complete coverage of the Swift API, and we also realized complete coverage of the Swift nodes, so everything is tested and proved to work properly. In addition, all tests are executed automatically, as I said, so we can test everything in a short time: in this project we can test one Swift cluster, which consists of 70 Swift nodes, in only one hour. Okay, so the last challenge. The fourth challenge in the project was backup, and I think it was the most special challenge in this project. In this project, the storage system stores end users' data, so we have to do everything we can to save such data. As I explained as one of the key features of Swift, Swift is very durable against hardware failures, but not perfectly durable against external incidents like application bugs or operator mistakes. To save data from these external incidents, we have to back up data, keep the data in another independent external storage, and realize restoring data from that storage. Okay, so we decided to place backup storage in one of the three sites and take a backup from the distributed Swift cluster to the backup storage. It was not such an easy thing, and there were some points we had to consider, so I will introduce the points we considered to build this backup system. Okay, so now I'll show some requirements and our solutions for backup.
The first point we had to consider is the scalability of the backup storage. Swift is very scalable storage, so the backup storage should adapt to its scalability, because backup data grows as the data in Swift grows. We therefore decided to use Swift as the backup storage too, and focused on container sync, which is used for data synchronization between two Swift clusters, to realize backup between two Swift clusters. The second point is completeness. Swift works on the principle of eventual consistency, and you cannot get a complete backup from one part of Swift. To solve this problem, we run agent processes on all of the Swift storage nodes to find all data in the Swift cluster and back it up. The third point is the reduction of capacity. We have three replicas in the primary storage, so it is a little too expensive to have three replicas in the backup storage as well. We therefore reduced the redundancy of objects in the backup storage to one replica, and added a re-backup feature: when a failure happens in the backup storage, we can back up the lost data again with this re-backup feature. The last point is network consumption.
As we showed in the previous slide, the primary Swift cluster is distributed over two sites, and we have to gather the data to the storage in one site. So in our backup system, we first back up data inside one site and then check data across sites. Okay, so as a result we constructed a backup system named container backup, as I show in this slide. Container backup was developed based on container sync; it has an agent process on every storage node, checks all data in the Swift cluster, and backs up the data to another Swift cluster. We have a manager process to control the agents and execute backup as a batch job. With this feature we can control the order of the regions where the agents work: first we start the agents in the same site as the backup Swift cluster and transport all of the objects, and then start the agents in the other site to check the data in the backup storage and ensure the completeness of the backup. So now we have solved all the challenges. First, to ensure data durability, we did recovery tests in a variety of failure situations. Second, to realize a geographically distributed cluster, we constructed a pseudo geographically distributed cluster and tested it. For the delivery and quality problem, we constructed automated building and testing tools and automatically built and tested the Swift cluster. And finally, for backup, we made a backup system between two Swift clusters based on container sync. Okay, so here I show the detail of the system construction.
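Before the construction details, the two-phase agent flow just described can be sketched with in-memory stand-ins for the clusters; the site names and objects below are purely illustrative:

```python
# Sketch of the two-phase container-backup flow: phase 1, agents in the
# same site as the backup storage transfer their objects in bulk; phase 2,
# agents in the remote site check the backup and fill in anything still
# missing, ensuring completeness of the backup.
primary = {
    "site_a": {"obj1": b"a", "obj2": b"b"},  # same site as backup storage
    "site_b": {"obj2": b"b", "obj3": b"c"},  # remote site (replicas overlap)
}
backup = {}  # single-replica backup cluster stand-in

def run_agents(site, check_only_missing):
    """Copy a site's objects to backup; optionally skip ones already there."""
    for name, data in primary[site].items():
        if check_only_missing and name in backup:
            continue  # already transferred during phase 1
        backup[name] = data

run_agents("site_a", check_only_missing=False)  # phase 1: in-site transfer
run_agents("site_b", check_only_missing=True)   # phase 2: cross-site gap check
```

Phase 2 only moves objects that phase 1 missed, which is how the design limits cross-site network consumption.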
We constructed six Swift clusters: three for primary and three for backup. The primary Swift clusters are distributed over three sites and have three petabytes of capacity each, and the backup clusters are gathered in one site and have one petabyte of capacity each. We also set up a monitoring system and a visualizing system for system performance and resource consumption, and we also set up a configuration manager in each site. Okay, so with these solutions we successfully finished our deployment project. We finished without any death march and kept the on-time release. Today the storage system is working properly, and I have the latest data: today our Swift clusters serve about 50 million GET requests, 15,000 PUT requests and 500,000 DELETE requests per cluster per day, and we have three Swift clusters, so they serve three times as many requests as a whole, and everything seems to work well. But of course there were small problems in that project, so today I will introduce three important examples of them. The first problem was about the behavior of Swift. In Swift there are some processes on several nodes, and sometimes a race condition between them causes an error. For example, a client sends an update request for an object just when the object auditor starts to check the same object; then the object auditor fails to check the object and generates a quarantine file, and that ends up as an alert in the system log.
We tested an enormous number of failure cases, but it is very difficult to test all of these kinds of race conditions. The second problem was about global distribution. In the production environment we faced a situation where connection timeouts between proxy and storage sometimes caused errors, although we had tested and ensured that Swift works with latency. So we checked one million tcpdump records with our own eyes to find out why such timeouts happen, and found some losses of SYN packets, which cause the connection timeouts. In the production environment there is a lot of network equipment between two nodes, and some packet loss happens in this equipment. This packet loss is an important thing, and we have to consider it when we construct a globally distributed Swift cluster. As a solution for this problem, we raised the timeout threshold and backported some behavior from the Swift Havana release which makes Swift use handoff nodes more effectively when connection timeouts happen. Okay, then the third problem was about performance tuning. We used cost-effective servers with no SSDs, but it's not easy to get effective performance using hardware with low I/O performance. To improve the performance of the Swift cluster, we tuned some parameters from two aspects. The first one is the Swift front-end processes: we reduced the number of green threads and raised the number of worker processes in our proxy server daemons. Green threads switch context effectively for network I/O, but do not switch for disk I/O, so when you use hardware with low disk I/O performance, the disk is often the bottleneck of the whole performance, and a green thread sometimes gets stuck in disk I/O processing. To avoid this stall we used the process model, and now the scheduler of the OS effectively switches context for disk I/O. The second point is the speed parameters of the back-end processes. There are some processes working on a Swift
cluster, and we can categorize them into two classes. One is the front-end processes, which work to serve client requests — the proxy server, account server, container server, those kinds of server daemons. The other is the back-end processes, which work to keep the consistency of data in Swift, like the replicators, auditors, updaters and so on. On hardware with such limited disk I/O performance, we should slow down the back-end processes to keep effective resources for the front-end processes. So we checked all the back-end processes and their resource consumption, and as a result there were two dominant resource consumers. The first one is the object auditor, which mainly consumes the bandwidth of the disk I/O, and the other is the DB replicator, which mainly consumes the IOPS of the devices. So we limited the speed of these resource-consuming processes to save resources for the front-end processes and keep effective performance for client requests. Okay, so now we had finished everything: we completely satisfied our customer's requirements and solved all the problems, and we released the system on time. In addition to the success of the project, there were some other good things. First, we contributed some patches to the OpenStack project to fix bugs we found in the testing, and now I'm listed as a contributor in Juno. And, most important, I got my session in this summit, and now I can be here — I could come to Paris. Thank you. Okay, so at the end of my presentation, let me explain what we now think about our future vision. The first vision is about automation.
In this project we constructed automated building and testing tools, and we are now trying to make them more general. In particular, there are many other things we should automate in testing, like recovery tests and race condition tests. The recovery test in particular is very time-consuming because there are so many test cases, so we are now trying to automate this kind of testing. I think operation is also another thing we can do more of with automation — we have more things we can automate for effective operation. The second vision is about resource management. In this project, performance tuning was one of the problems, and we struggled to limit the resource consumption of the back-end processes. Now we have a plan to do such limitation more intelligently, using something like cgroups, with which we can limit resource consumption flexibly. The third vision is about new features. We used the Grizzly version in our project, but there are some additional features in today's latest release, like storage policies and erasure coding, so we are very interested in them and have a plan to update our solution to realize more efficient usage of capacity. Okay, and then we also have another big vision about Swift, especially about practical use of data. Swift can be storage for huge data, but it is not so easy to use this huge data effectively: it becomes more and more difficult to find the data you need as the data in the storage grows. So now we are working on metadata search in Swift, which integrates search features into Swift. With this new feature, we try to realize more efficient use of data within Swift. Okay, that's all for my presentation.
So, are there any questions about it?

Q: I'd like to know what kind of parameters you try to monitor — monitoring of the services as well as the nodes.

A: Okay, I will show you, just a moment. This slide shows the monitoring and visualization systems; in the presentation I didn't talk about this in detail. We constructed the monitoring system with Nagios and the visualizing system with Ganglia, and we monitor all of the services on the nodes with the monitoring and visualizing systems to check whether all of the nodes are working properly. In addition, we also check the service through the proxy server, so we also check that the Swift cluster works properly as a whole.

Q: The proxy server and the object servers — how did you distribute the monitoring?

A: Yes, okay, these nodes are distributed, so we have two strategies for monitoring them. We set up monitoring in the three sites, and each of them checks the nodes in the same site; and also, to check the whole function of Swift, we use the monitoring system near the proxy servers and send requests to the proxy server to check the whole function.

Q: So you separated the proxy from the account, container and object servers. How did you deploy account, container and object servers — what is the ratio between the proxy and the object servers? Did you deploy account, container and object together, or did you deploy them with the proxy? What was the configuration you used for that?

A: Okay, so your question is about how we deployed the proxy servers and the storage servers, right? Okay, so we show the proxy and the storage as figured in this slide: on the proxy nodes, the proxy server daemon is running, and the others — the account server, container server and object server — are running on the storage nodes. Is that enough for your question?
Q: Okay. I have a question about your backup use case. You mentioned that one of the use cases is to protect against unintentional errors from clients, right — accidentally overwriting or accidentally deleting some of the objects in your Swift cluster — and for disaster recovery you have a multi-region setup where you actually replicate data across sites. For these kinds of bugs, I'm curious how container sync addresses them, because container sync could actually propagate all the errors from your primary Swift cluster if you have deleted or overwritten the data with something else. There is something like versioning that could be a useful feature, so I'm curious why you chose container sync over some of the other features.

A: Okay, so the main reason why we chose container sync is that we can make the backup Swift cluster independent of the main Swift cluster. When we notice that some bug is happening in the application, all we have to do is stop the container backup daemon. We run the container backup daemon each day, and all we have to do is stop the job for the backup, so it is a very easy way to keep the data in the backup storage. That's why we chose container sync and an independent backup storage to take the backup of the data.

Q: Okay, so you are relying on your ability to detect these errors in a timely fashion.

A: Yeah, in a timely fashion. Yeah, thank you. If there are no other questions, okay, that's all. Thank you for coming to the session.