Hello, everyone. Thank you for coming. I'm Jasmine, from China, and I'm the project manager at UMCloud. Today I'm very honored to have this opportunity to represent my company and talk about a case study for our customer, Bailian Group.

Before we get started, let me give some background on this case. First, about UMCloud: UMCloud is the joint-venture company of Mirantis and UCloud in China, and its predecessor was Mirantis China. Our customer, Bailian Group, is China's largest retailer, with more than 6,000 department stores, supermarkets, and outlets across 25 cities and provinces in China; it is often called China's Walmart.

Like other fast-growing geographies, however, China's retail business began shifting from brick-and-mortar to online commerce. In China there are many online competitors, such as JD.com, Suning, and Alibaba; they unveiled new e-commerce marketplaces and occupied a large market share. Faced with these challenges and opportunities, Bailian Group set aggressive goals to build a new technology platform that would promote development while streamlining operations for online payments, business intelligence, and loyalty programs.

Starting in 2014, Bailian Group accelerated its multi-channel selling efforts and established the Bailian Omni-Channel e-commerce company, with the goal of building the industry's largest offline-to-online commerce cloud platform and increasing Bailian's market share. During this process, the Bailian team realized that low server utilization, long provisioning times, and the high cost of operating hundreds of applications on thousands of bare-metal servers would impede their omni-channel sales goals. So Bailian Group needed a flexible, cost-effective, and scalable cloud platform to replace its existing, complex IT infrastructure.

Today I have two partners with me, Chitron and Julian. Chitron is from Bailian Group.
He is the architect of this cloud platform, and Julian is my colleague. Next, they will give us a detailed presentation of this case. Thank you.

OK, thanks, Jasmine. Hello, everyone. My name is Chitron Lu. I'm from Bailian Omni-Channel, a subsidiary of Bailian Group, and I'm the architect of the Bailian cloud project. Today I will share some experience from the Bailian OpenStack project.

Three years ago, Bailian Group decided to set up a new company to integrate the online business and the offline business. In the last two years, we have developed many business services, such as shopping-guide screens in our shopping malls, fresh-food ordering, home delivery, virtual shops, a live cooking workshop, integrated payment, self-pickup cabinets, and a parking system. All of these services are designed for daily life. In the last two years, we also invested a lot of manpower in developing online systems, like our mobile apps, PC web, and HTML5 mobile web. Especially on our mobile apps, customers can do more than just shopping. For example, if it's raining outside today and I forgot my umbrella at home, I can borrow one from a Bailian store, and the next day I can return it to another store. We also developed a monitoring dashboard driven by our big data system.

What kind of cloud are we building? To meet these system requirements, we need a large platform, so we decided we needed a cloud to solve this problem. The first problem we found: we have so many systems and apps to build. There are hundreds of applications, but our resources are limited and time is short. We need a platform that can deploy hundreds of applications in a short time, across the different programming languages we use, such as Java, Python, Go, and PHP. We also need numerous servers to support the promotion system and the seckill (flash-sale) system on short notice. So we decided the cloud should meet several requirements. The first one is "numerous."
Like I said, Bailian has so many apps, so we need a platform that can supply numerous VMs or containers to support these apps. The second is "faster": the platform should supply resources quickly, faster than physical servers or standalone virtual machines. The third is "more economical": we found another problem in our systems, which is that most of the time they are idle, and that's very wasteful, so we think the cloud can help us increase system utilization and save power. The last one is the most important: stability. Our business combines online and offline. If our system goes down, many other problems follow; for example, if the payment system is down, people cannot pay in the supermarkets or shopping malls of the Bailian Group. I think that would be very bad.

Next is our company's cloud vision and strategy. Starting in August 2015, Bailian researched more than 20 private cloud vendors over six months, and we held more than 200 meetings to discuss how to build a cloud system. Then we decided to cooperate with UMCloud. Fifteen months ago, our systems were still in the bare-metal age. Now we are in version two, the cloud era. In the future, we expect to be in version three. This year, we have built more than 300 nodes and more than 5,000 VMs on our cloud platform. This is only our base system; we haven't yet integrated the CI/CD system with the cloud. Next year, our key work will focus on SOA, microservices, and RPC management, and we will develop some plug-ins on the cloud to cover the CI/CD system. In the future, Bailian will run more apps, up to a million, and we will build our SaaS platform on top of the IaaS and PaaS platforms.

The cloud does not only bring us technology advances; it also brings some problems. In the cloud, we can get resources very easily, like VMs, networks, and storage. We can even build a whole development or production environment just by clicking the mouse, in a few minutes.
That's very easy and convenient, but it means we need a new process to handle such requests, and we also have to change our engineers' way of thinking, because our resources are still limited. Although we have hundreds of nodes, our resources are not enough to cover all of our apps, so we need to set up new standards for resource utilization rates and fine-grained management.

When we chose a cloud, we thought it needed to have the following features. Our company is a technology company, so we need to learn how to develop a cloud platform ourselves; we need to know the principles and the source code, and then we can develop new features on our own. Our business has so many apps, so we also require the platform to be flexible; it should have all kinds of plugins to meet different requirements. In summary, we chose OpenStack, because OpenStack is open source, advanced, and highly flexible. OpenStack also brings us other benefits, such as scalability, a friendly interface, easy deployment, and rapid development. Next, let's welcome Julian. He will introduce the solution for us.

Hello, everyone. Thanks for Chitron's sharing. It's my turn to share the whole solution with you. First of all, let me introduce myself. My name is Julian Wong. I'm the principal architect at UMCloud. UMCloud is the local branch of Mirantis in China. I previously worked at Mirantis, Canonical, and Nokia. I have worked more than six years in the OpenStack area, focused especially on architecture design and solution delivery. I'm very glad to have this opportunity to share the solution with all of you.

OK, let's see what our solution looks like. This is our solution team. It actually involved more people than this photo shows, but this photo is the core team. It includes very experienced engineers, developers, and architects, and of course the project manager and the customer himself.
With this team, we managed to build this cloud. If you remember what Chitron just said, there are three phases or stages on their roadmap, and we are on the first stage. The first-stage target is to build a cloud with 300 nodes running a maximum of 5,000 VMs. By purpose, it is divided into five different environments. The production environment, as we all know, runs the production workloads and provides services to the public. The pre-production environment is a place to test applications before they go into real production; it's for dry runs and the like. The development environment is for the development team, so they can use it to test their code changes; we also did some tricks so that their code commits trigger automatic deployment, making it easy to test whether a code change works. The next one is the testing environment, which is for the testing team, of course: every development team owns one module, and the test team integrates all these modules together into a whole application and does functionality and compatibility testing in this environment. The last one is the application environment, where we provide easily deployed applications, for example RabbitMQ, ZooKeeper, Redis, or PostgreSQL, so users can easily get the applications they want from this environment.

At the beginning of the cloud build, we have the ADA session, which is the Architecture Design Assessment; I'm not sure if you can see this slide very clearly. During this session, we work to understand the customer's requirements, their pain points, problems, and challenges, and also what workloads will run on this cloud. For example, a typical application at Bailian combines web services, database services, and a message queue.
Also, on the operations side, we need to understand how they currently operate the cloud, because we all know that operational processes cannot be changed easily in a short time; in the first phase of this cloud build, we need to adapt to their operational process.

After understanding what the customer wants from the cloud, we set four design specifications that fully match what Chitron said they require. The first one is dynamically scalable, which has two different parts. One is hardware resource expansion: we can add more nodes online, like adding Nova compute nodes or Ceph nodes, to add capacity to the cloud. The other part is at the application level: applications can auto-scale based on utilization. That's dynamic scalability. The second one is high availability, because we want a stable infrastructure, so we need an HA solution that fully covers every component of the storage, the networking, and OpenStack itself. The third one is DevOps. The problem was that application deployment was too slow, and the development team could not test their code easily; the DevOps solution is our specification to solve that. The last one is easy operation: we need to follow their current operational process and help them use this cloud easily.

This picture shows the cloud integration architecture, and it's a very typical one. At the bottom is the hardware; above that is OpenStack, then the PaaS layer, then the products. Each product (POD) has different modules, and each module is made by a different team, so it's very complex. At the top layer there are DNS, CDN, and internet access, and of course in the next phase we'll connect to the public cloud as well.

This is the OpenStack design. We chose Mirantis OpenStack 8.0, which is based on the OpenStack Liberty release.
I just want to highlight Fuel, because Chitron said they wanted to expand the cloud by adding more nodes, to keep the cloud flexible. Fuel can automatically discover new nodes, install the operating system, configure OpenStack, and add them to the resource pool to gain more capacity. So Fuel helped us build this cloud starting from 30 nodes, adding more and more nodes, and finally reaching 300 nodes.

For the network: one POD means one product, and one product is composed of different modules. These modules are created and managed by different teams, and these teams belong to different tenants. So the requirement is that tenants belonging to the same POD need to access each other, and otherwise should be isolated. We all know this can be done through public IPs or floating IPs, but through NAT the performance is bad. So we used a simple approach: the Neutron provider network. The external switch handles this; it controls whether tenants can talk to each other or not. And it has been proven in this 300-node cloud that the Neutron L3 gateway does not become a bottleneck, because the external switch takes control, so there's no bottleneck at all.

This picture shows the storage design. We have multiple storage backends: we provide object storage, we have a Ceph cluster, and we have SAN storage. For Ceph storage, we built two pools: an SSD high-performance pool and a SATA regular-capacity pool.

This picture shows the HA design, which is a very typical solution. All the OpenStack services are handled by a VIP plus HAProxy on a Corosync/Pacemaker cluster, and MySQL and RabbitMQ use their native clustering solutions. Besides that, we also covered the networking with bonding for network redundancy, and for storage, Ceph has its own replication solution.
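The POD isolation rule described above can be sketched as follows. The tenant and POD names and VLAN IDs are made up for illustration; in the deployment itself the mapping is carried by Neutron provider networks, and the enforcement happens on the external switch rather than in software:

```python
# Each POD (product) is backed by one provider-network VLAN.
POD_VLANS = {
    "pod-payment": 101,   # payment product: web / db / mq teams
    "pod-loyalty": 102,   # loyalty-program product
}

# Each tenant (team) belongs to exactly one POD.
TENANT_POD = {
    "payment-web": "pod-payment",
    "payment-db":  "pod-payment",
    "loyalty-web": "pod-loyalty",
}

def can_communicate(tenant_a: str, tenant_b: str) -> bool:
    """Tenants reach each other only when they share a POD (and its VLAN)."""
    return TENANT_POD[tenant_a] == TENANT_POD[tenant_b]

def vlan_for(tenant: str) -> int:
    """Provider-network VLAN ID backing the tenant's POD."""
    return POD_VLANS[TENANT_POD[tenant]]
```

So `payment-web` and `payment-db` talk directly on VLAN 101, while `payment-web` and `loyalty-web` are isolated by the external switch.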
So in the whole solution, we don't have any single point of failure at the HA level.

This is the application solution. We're using the Murano module. Murano is an application catalog, so we can use it to rapidly deploy applications for users, and we can also help the Bailian group customize Murano applications and create new ones for their commercial use. And because Murano calls the Heat engine, it natively supports auto-scaling for applications, which fully matches their requirements. Also, by integrating Fuel with Jenkins, we can move this toward DevOps, enabling full lifecycle management in the development phase.

We also mentioned that we plan to use microservices to replace the current monolithic applications, using Docker and Kubernetes as the solution. Kubernetes can be automatically deployed by Murano, which sets up the cluster environment and exposes the API. OpenStack works as the underlying infrastructure and also provides containers with multi-tenancy, networking, security, and auto-scaling capabilities. In the future, once we want to connect to the public cloud, we'll also benefit from Docker and Kubernetes, because they connect easily to Google GCE, AWS, and other public clouds.

To adapt to their operating process, we made many changes. Let me give some examples. Some of their old applications running on bare metal need a fixed IP, like the first IP in the segment, to be assigned to a database. But we all know OpenStack assigns IPs automatically, so we patched this to let the user specify the fixed IP at launch. Also, their operations team has many different roles, and each role can do only its own things; for example, the network people can only touch the network settings.
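The role separation just described can be modeled as a small permission table. The role names mirror the ones on the slide (network reader, network owner, tenant owner), but the permission sets here are illustrative assumptions; in OpenStack they are expressed as Keystone roles combined with per-service policy rules:

```python
# Illustrative mapping of operator roles to permitted actions.
ROLE_PERMISSIONS = {
    "network_reader": {"network:read"},
    "network_owner":  {"network:read", "network:write"},
    "tenant_owner":   {"network:read", "instance:launch"},
}

def is_allowed(role: str, action: str) -> bool:
    """Check whether a given role may perform a given action."""
    return action in ROLE_PERMISSIONS.get(role, set())
```

For instance, a network owner may change network settings, a network reader may only view them, and a tenant owner may launch instances but not touch the network.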
So we fully followed their process and created many different roles, which you can see in the picture: a network reader can only view the settings, a network owner can change them, and a tenant owner can launch instances but cannot touch network settings. To make better use of the multiple storage backends, we also hacked something to let you choose which backend to use when launching an instance.

OK, after everything was deployed, we ran this list of tests to verify that the whole thing works. Tempest has about 1,400 cases covering all the API functionality, and Rally is well known as a performance testing tool. Shaker is a network performance testing tool, and COSBench tests the performance of the object storage. We also did manual HA tests, such as destroying one controller node and recovering it.

For better operations, we provide the StackLight solution for monitoring, logging, and alerting. I want to highlight the alerting: the system is integrated with an email server and an SMS gateway, so once any problem occurs in the Bailian environment, it sends an email and a short message to Bailian's operations team. After all of this was done, we handed the environment over to the customer and trained them on the architecture design, how to develop with the OpenStack API, how to hack the OpenStack code itself, and how to operate it.

OK, so now it's a 300-node environment, and in the future we'll go across multiple regions with a unified portal for management and unified authentication; that's the next step. OK, that's all. Any questions? Yeah, please.

You mean in the current setup? Yeah, currently the L2 is handled by OVS, so Neutron plus OVS handles the L2 traffic, and all the L3 traffic is handled by external switches.
Yeah, we're using VLANs, and inside each VLAN the traffic is handled by OVS, where you can create your own subnet, so it's also software-defined; but if you go through an L3 gateway to the outside, that is handled by the external switch. All that traffic is handled by the external switch. OK, does that answer your question?

Yeah, we are using the Open vSwitch driver, the native one: ML2 plus Open vSwitch. For now it's VLAN, no VXLAN. That's the next phase: when we go across multiple regions, we will enable VXLAN, and maybe we will integrate an SDN solution like Juniper Contrail or Cisco ACI. OK, any other questions? OK, that's enough for today. Thank you, thank you, everyone.