Hello everyone, my name is Liu Junwei, technical director of China Mobile Suzhou Research Institute. Today, together with my co-worker Luo Gangyi, the elastic computing team leader, I will share the topic of performance analysis in large-scale deployment. It is a real scenario from our public cloud. Let's proceed according to the agenda.

First of all, allow me to give a brief introduction to my company. China Mobile Suzhou Research Institute is responsible for the development of cloud computing and big data products. Its total investment is more than 3 billion, and it now has 300 employees.

On this slide, let's talk about the deployment architecture of China Mobile Public Cloud. So far there are two data centers, in Guangzhou and Beijing. We developed a product called BCOP to unify management and operation across all the data centers. Each data center now contains more than 2,000 physical nodes. Given OpenStack's scalability issues, we split each data center into two regions, with about 1,000 nodes per region. We also divided the nodes into different zones according to hardware and guest OS type; you can see there are high-performance zones, low-performance zones, and Windows zones. It is notable that the Guangzhou data center has heterogeneous regions: an OpenStack region and an OpenNebula region. OpenNebula is another kind of computing platform; we have built on it since 2009. We use 34 controller nodes to manage all the compute nodes, about 650; one region is 1,000 nodes in total. Within one region, the Keystone, MQ, and DB services are shared. Next, my co-worker will talk about the most important part: the performance analysis results.

Okay, I will talk about the performance analysis of our computing platform. First, let me clarify that all the analysis I will show is based on vanilla Neutron, although in production we actually use a third-party SDN provider. Our SDN provider had some problems: they did not give us a testable product, so we could only test vanilla Neutron.

Earlier this morning, the guys from Intel said they started 5,000 virtual machines in 40 seconds, so I wonder how they did that. Do they have some black magic? I don't know, but in our practice, starting 1,000 VMs takes more than five minutes. Actually, I am not surprised that it is so slow, because the whole procedure of starting a virtual machine is very long: it takes hundreds of database accesses, tens of RabbitMQ message exchanges, and tens of API calls.

When we finished setting up our cloud environment, we first ran some concurrency tests, and these four figures show the results. In the bottom-left figure (let me clarify that red means failure and blue means success), you can see that at 768 concurrency there are many failures, and at 1024 concurrency more than half of the VM starts failed.
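For illustration, the kind of boot concurrency test described above can be scripted along these lines. This is a minimal sketch, not our actual test harness: it assumes openstacksdk with a clouds.yaml profile named "mycloud", and the image, flavor, and network IDs are placeholders.

```python
# Hedged sketch of a boot-storm concurrency test (not the original harness).
import time
from concurrent.futures import ThreadPoolExecutor

import openstack  # openstacksdk

CONCURRENCY = 256          # sweep through 256, 512, 768, 1024 ...
IMAGE_ID = "IMAGE_UUID"    # placeholder
FLAVOR_ID = "FLAVOR_UUID"  # placeholder
NETWORK_ID = "NET_UUID"    # placeholder

def boot_one(index):
    """Boot one VM, wait until it is ACTIVE, and report success or failure."""
    conn = openstack.connect(cloud="mycloud")  # assumed clouds.yaml entry
    try:
        server = conn.compute.create_server(
            name="perf-test-%d" % index,
            image_id=IMAGE_ID,
            flavor_id=FLAVOR_ID,
            networks=[{"uuid": NETWORK_ID}],
        )
        conn.compute.wait_for_server(server, wait=600)
        return True
    except Exception:
        return False  # a failed start, i.e. the red part of the figures

start = time.time()
with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    results = list(pool.map(boot_one, range(CONCURRENCY)))
print("%d/%d succeeded in %.1fs"
      % (sum(results), CONCURRENCY, time.time() - start))
```

Sweeping CONCURRENCY upward and plotting the success and failure counts gives the kind of result shown in the four figures.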
So what was the problem? We think it comes from three aspects. The first is API limits: we saw a lot of "cannot establish a new connection" errors in the nova-api and neutron-server logs. The second is database limits: we saw a lot of "cannot allocate a connection from the database pool" errors, and we also saw many database deadlocks. The third is message queue limits: we found messages in the logs like "connection pool reached capacity" and "timeout waiting for a reply". Lastly, we found some oslo.messaging bugs, and we got the patches from Mirantis, so, okay, thanks Mirantis.

Solving the API limits is easy: we can simply increase the number of API workers, or just add more API nodes. We can also use mod_wsgi with httpd instead of the built-in Python server.

The database is a harder problem. First, we looked into the max TPS of the database in the concurrency test; the analysis is shown in the left figure. We found that when concurrency goes up to 128, the max TPS reaches 6,000, and at concurrency 256 or 512 the max TPS does not increase any more, so we can assume that 6,000 TPS is our database limit. The right figure shows the timeline of the whole 512-concurrency run. You can see that almost all of the time the TPS stays above 5,000, so the pressure on the database is very high. How do we solve this problem? Currently Nova and Neutron share the same database, so the easy thing we can do is separate them: let Nova use the database alone, and give Neutron its own database. I also think we can try SSDs, because SSDs are faster than normal SAS disks.

The MQ is another problem. First, let's look at the max message delivery rate of the MQ in the concurrency test; the analysis is in the left figure. We can see that the delivery rate keeps increasing with concurrency, which is quite normal and does not show a ceiling yet. The right figure is downloaded from RabbitMQ's official website; if it is accurate, about 40,000 messages per second is the ceiling, and in the left figure we can see that at 1024 concurrency we already reached more than 35,000. So I assume that the MQ is actually close to its limit. What can we do about the MQ? As I wrote at the bottom of the slide: we can increase the size of the connection pool, we can let Nova and Neutron use different MQs, we can increase the reply timeout, and we can also use SSDs.

After we separated the databases, increased the number of API workers, increased the timeouts, and fixed some bugs, we succeeded at 1024 concurrency. Actually, I have run tests at higher concurrency, but since vanilla Neutron is not what we use in production, those results do not have much value, so I did not include them here.

Okay, then let's look at another part: the bottleneck of the monitoring architecture. The figure shows our monitoring architecture. We use Ceilometer, Gnocchi, and InfluxDB. We gather performance data with the ceilometer-compute agents and send it to the ceilometer-collector over UDP; the collector sends the metrics to Gnocchi, and Gnocchi stores the measures in InfluxDB. But this architecture has some performance issues, so let's do the math. If the sampling interval is 60 seconds, each VM has about 110 meters, and we have 10,000 virtual machines, that means we must store about 18,333 samples per second.

We have run this test. I had six ceilometer-collector servers, and on all of them the CPU utilization was larger than 80%; I had six Gnocchi servers, and their CPU utilization was also larger than 80%. We monitored the MQ as well and found that the TPS was larger than 8,000, and the traffic through HAProxy and LVS was more than 200 megabytes per second. So this pipeline consumes a lot of resources.
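The arithmetic behind that estimate is simple enough to check with a few lines. The helper below is just the obvious rate formula, not anything from Ceilometer itself, and the parameters are the ones quoted above.

```python
# Back-of-envelope load estimate for the monitoring pipeline.
def samples_per_second(vms, meters_per_vm, interval_s):
    """Average number of samples the pipeline must persist per second."""
    return vms * meters_per_vm / float(interval_s)

# The scenario above: 10,000 VMs, ~110 meters each, 60 s sampling interval.
print(samples_per_second(10000, 110, 60))   # ~18,333 samples/s

# Cutting the meter list in half, or doubling the interval, halves the load:
print(samples_per_second(10000, 55, 60))    # ~9,167 samples/s
print(samples_per_second(10000, 110, 120))  # ~9,167 samples/s
```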
So what can we do to improve it? I don't know. Maybe we can add a cache to Gnocchi, so it does not need to access the index database every time; it can find the indexer entry in the cache. And maybe we can use SSDs for InfluxDB; the performance might then grow.
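The cache idea might look like the minimal read-through sketch below. Note that query_index_db is a hypothetical stand-in for the real indexer query; Gnocchi does not expose this exact function.

```python
# Minimal sketch of a read-through cache for indexer lookups (assumption:
# the expensive part is resolving a (resource, metric) pair in the index DB).
import functools

def query_index_db(instance_uuid, metric_name):
    """Hypothetical stand-in for the index-DB round trip we want to avoid."""
    return "resource-for-%s-%s" % (instance_uuid, metric_name)

@functools.lru_cache(maxsize=100000)
def resolve(instance_uuid, metric_name):
    """Read-through cache: only the first lookup per key hits the DB."""
    return query_index_db(instance_uuid, metric_name)

resolve("vm-1", "cpu_util")   # miss: goes to the index DB
resolve("vm-1", "cpu_util")   # hit: served from the cache
print(resolve.cache_info())   # hits=1, misses=1
```

Under a steady workload the key set is stable (the same VMs report the same meters every interval), so the hit rate should approach 100% after the first cycle.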
Okay, that's my part.

On this slide, we will talk about our future roadmap for our private cloud. Our private cloud is even larger, with about 6,000 physical nodes in two data centers; each data center contains 3,000 nodes. We will use an SDN solution based on hardware devices, which I think is a first in China for a private cloud at this scale. We will also provide a bare metal service based on the Ironic project, another first, and the most difficult part is that Ironic has to be integrated with the SDN solution; with an SDN solution based on hardware devices, I think that is very difficult. And we will provide a file share service based on the Manila project, also a first. Our private cloud will be online in 2017, so in the next six months we will summarize our experiences, and we look forward to sharing them with OpenStackers in Barcelona. Thank you. You can use the microphone behind you.

Q: What was your network architecture? On the Neutron side, was it L3 HA, or were you using DVR? How was your network set up?
A: Let me describe it first. In my tests, we used native Neutron with VLAN; we did not use DVR, and we did not use VXLAN. But in our production environment, we use a third-party SDN provider's product, and their solution uses DVR.

Q: Okay. One more question, related to Ceilometer. Was the Rabbit bus separated out from your control plane, did you use a separate Rabbit bus for Ceilometer, or is it all the same single Rabbit notification bus?
A: We do not use the MQ there; we use UDP directly.

Q: Did you use Nova cells for the control plane?
A: No, we do not use Nova cells. I think we cannot use them here.

Q: You said you use a Gnocchi cache and you were planning to increase the capacity. How much did you have originally, and how much do you plan to increase?
A: I found on the InfluxDB website that if we use SSDs as the backend, it can store more than 15,000 measures per second. So I think if InfluxDB can reach that performance, then Ceilometer with Gnocchi should also reach it.

Q: Did you mention what kind of storage you are using? You said 250 storage nodes; what storage is it?
A: Actually, we use Sheepdog.

Q: Okay, thanks. And in these tests?
A: Oh, yes, it was local disk, sorry. That was not a production test; we just wanted to test the performance of every single project, so we used local disks for the test. Yes, we have a cache. Any other questions?

Q: In the RabbitMQ evaluation, have you checked which of the projects issues the most messages, like Nova, Neutron, or Keystone?
A: Well, I am sure Keystone does not issue messages to the MQ. Actually, in our production environment, Nova and Neutron use different MQs. I did not compare them, but I think they are roughly equal.

Q: Do you use RabbitMQ, with mirrored queues?
A: Yes, I use Rabbit, with mirrored queues.

Q: So you are using the mirrored queue option. Is that the active-active style?
A: Yes, active-active.

Q: Did you shard your cloud? Did you scale 1024 as a single cloud, or did you shard it? You didn't use cells?
A: Oh yes, we have two regions, and we share one Keystone. Only Keystone is federated; the rest is divided into the two regions.

Q: Will you be publishing your results anywhere, with the testing and all of your data? Will you be sharing that with the community?
A: Okay, I think I can share this data. Thank you.

So, thank you.