Hello everyone. I'm Haihua, a development engineer at JD.com, and I want to show you how we operate the largest Vitess clusters in the world. I'll cover two parts: first, why we chose Vitess, and second, how we run Vitess at JD.com. My colleague Jinke will then introduce the problems we have run into and our future plans.

First, why did we choose Vitess? What is Vitess? Vitess is a database clustering system for horizontal scaling of MySQL. At its core it shards the database, but it does more than just sharding. This is the architecture diagram of Vitess. At the top are the application nodes; there can be several of them, and they connect through the MySQL protocol. The application sends a query to the gateways (VTGate), and the gateways dispatch it to the tablets. Sharding is done on a key, for example the ID column: a table that originally lived in one database can be split by its ID values and placed across three databases, and in this way we divide the database into parts. The same applies at the table level: a keyspace here contains three shards, and the rows of the original table are spread across the three nodes, which achieves the division of the original database. There is also a topology service: every table is sharded on some field, and all of that sharding metadata is recorded in the topology. Finally, there are operational tools, vtctl and vtctld, which expose management functions externally. For example, if I already have some shards and want to add more, or I want to take nodes down or bring new nodes up, I can do that through vtctld. So this is the overall structure of Vitess.
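To make the sharding idea above concrete, here is a minimal Python sketch of routing rows to shards by an ID column. The modulo rule and all names are illustrative assumptions, not Vitess's actual vindex implementation.

```python
# Illustrative only: route each row to one of three shards by its ID,
# the way the gateway described above dispatches queries by sharding key.
NUM_SHARDS = 3

def shard_for(row_id: int) -> int:
    # A simple modulo rule stands in for Vitess's configurable vindex functions.
    return row_id % NUM_SHARDS

def route(row_ids):
    # Group rows by target shard, as if dispatching them to three databases.
    shards = {i: [] for i in range(NUM_SHARDS)}
    for row_id in row_ids:
        shards[shard_for(row_id)].append(row_id)
    return shards
```

Because every query that carries the sharding key maps to exactly one shard, the gateway can dispatch it without touching the other databases.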
So Vitess is mainly about sharding the database, but not only sharding. We had several factors to consider when choosing it. First, it is built on MySQL, which is used very widely inside JD.com, and we have a lot of experience operating MySQL, so we are very confident in that foundation layer. Second, migration cost: many services inside JD.com already use MySQL, and migrating them onto Vitess only needs a little modification on the application side, so the cost of migration is very low. Third, streaming queries. Suppose I have a lot of data in some database, maybe more than billions of rows, and I want to move that data onto a big data platform for analysis or downstream processing. With a small amount of data, we can simply read it through the gateway and write it to the platform. But with billions of rows, reading all of the data into the gateway first and writing it out to the big data platform afterwards causes problems: the gateway has to buffer a huge amount of data. Streaming queries avoid this: the whole chain keeps flowing and there is no blockage at any node, so every node only has to handle a stream rather than the full data set. The next features are resharding, two-phase commit, and secondary indexes, and I'm going to talk about these three parts in detail. Resharding is data re-splitting: for example, we originally have two shards, but there is too much data, so I need to split it into four shards. Vitess's resharding lets us do that.
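Returning to the streaming point above, it can be sketched as follows: a generator keeps memory flat because no node ever holds the full result set. All function names here are hypothetical stand-ins, not Vitess APIs.

```python
# Hypothetical sketch: a streaming export never materializes the full result
# set in the gateway, so memory stays constant regardless of row count.

def fetch_rows(n):
    # Stand-in for streaming rows out of the underlying shards one at a time.
    for i in range(n):
        yield {"id": i}

def export_streaming(rows, sink):
    # Each row flows straight through to the big data platform's sink.
    count = 0
    for row in rows:
        sink(row)
        count += 1
    return count
```

A buffered export would instead do something like `list(rows)` first, which is exactly the behavior that overwhelms the gateway at billions of rows.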
Next is two-phase commit, which ensures atomic commits and avoids partial commits. This is also one of the highlights of Vitess. But let's start with resharding, which we have separated into several steps. First, copying the data. Take one shard, which contains a master and its replicas; you can see the structure here. I take one of the replicas out of replication, so it is disconnected from the master and the two no longer have any relationship. This gives me a snapshot of the data. Say the original table is sharded by ID with a very simple rule. I split the snapshot rows based on that ID rule, using streaming queries to read them, and write each row into its target shard: rows matching this ID range go into this shard, and the rest into the other one. That deals with the snapshot data. But during the copying process, the master keeps taking writes, so there is a delta between the master and the snapshot replica. To catch up, Vitess synchronizes the binlogs: it reads the binlog events, converts them back into SQL, and applies every write and update to the target shards we decided on. Then comes the cutover, which affects the whole cluster. Without the master-replica relationship, the two sides would drift out of sync, so what do we do? The original shards stop serving, and the two new shards take over: traffic that was routed to the old shards is now routed to the new ones. The switch takes only a few seconds, so resharding affects the cluster for just those few seconds.
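The resharding steps above can be sketched as follows: split a snapshot of one shard into two target shards by ID, then replay the writes that arrived during the copy. All names and the split rule are illustrative assumptions, not the actual VReplication internals.

```python
# Hedged sketch of the resharding flow: snapshot split, then binlog delta replay.

def split_snapshot(rows, split_point):
    # Phase 1: divide the disconnected replica's snapshot by the ID rule.
    left, right = [], []
    for row in rows:
        (left if row["id"] < split_point else right).append(row)
    return left, right

def apply_delta(left, right, delta_rows, split_point):
    # Phase 2: binlog events captured while copying are converted back into
    # writes and applied to whichever target shard now owns each row.
    for row in delta_rows:
        (left if row["id"] < split_point else right).append(row)
```

After the delta is drained and the targets are caught up, the cutover reroutes traffic to the new shards, as described above.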
That is the resharding system; we have run this resharding process many times already inside JD.com. Next is distributed transactions and two-phase commit. I will first focus on the problem with multi-shard transactions, using a deliberately simple example. Again we shard by ID: if the ID is less than 512, the row goes into shard 0, otherwise into shard 1. Now a client runs a transaction. It inserts ID 900, which is larger than 512, so it goes to shard 1; it inserts ID 400, which is less than 512, so it goes to shard 0; and at last it commits. Shard 1 succeeds in committing, but shard 0 hits some failure or abnormality, so its commit fails and is rolled back. Because the shards commit independently, shard 1's commit cannot be undone: we end up with a partial commit, where shard 1 succeeded and shard 0 failed, yet overall the transaction has failed from the client's point of view. Whether this matters depends on your scenario. If you cannot accept partial commits, there are two approaches you can adopt. First, single-shard transactions: all of the data in a transaction may only touch one shard, either shard 0 or shard 1; if that shard fails, report the error to the client and record it. The second way to solve this problem is two-phase commit. With two-phase commit, all of the operations are first durably recorded in the redo logs before the commit; if the machine has a problem before the commit completes, we simply wait for it to reboot and recover.
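The partial-commit problem and the two-phase fix described above can be sketched as follows. The participant objects and their fields are hypothetical; Vitess's real implementation differs, but the shape of the protocol is the same: every participant first durably prepares, and the coordinator commits only if all of them prepared.

```python
# Minimal two-phase commit sketch. Illustrative only.

class Participant:
    def __init__(self, name, will_fail=False):
        self.name, self.will_fail = name, will_fail
        self.state = "idle"

    def prepare(self):
        # Phase 1: write the redo log; after this, commit is guaranteed possible.
        if self.will_fail:
            self.state = "aborted"
            return False
        self.state = "prepared"
        return True

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"

def two_phase_commit(participants):
    if all(p.prepare() for p in participants):   # phase 1: prepare everywhere
        for p in participants:
            p.commit()                           # phase 2: commit everywhere
        return True
    for p in participants:
        p.rollback()                             # any failure: roll back everywhere
    return False
```

Note how a failure on one shard now rolls everything back instead of leaving shard 1 committed and shard 0 not, which is exactly the partial-commit case from the example.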
After the machine recovers, it enters the redo phase and applies the recorded result; if it turns out the data has already been committed, replaying the redo log surfaces that and the transaction completes anyway. This is the framework Vitess uses for distributed transactions. We hit this scenario ourselves. Our TPS is very high, and we run on Kubernetes, and there was a problem: Kubernetes decided some nodes were unhealthy and kept rebooting them, again and again. Because the TPS was so high, those reboots caused partial commits. The partial commits were solved by two-phase commit; of course, we also had to deal with the underlying bugs on the Kubernetes side.

Next, secondary indexes. What is the function of this index? VTGate uses the VSchema for its routing: say IDs 1 through 10 go to shard 0, the next range to shard 1, then shard 2, and so on, and the table also has a name column. I insert a row with ID 5 and name Lisa, and another with ID 11 and name Lisa; the two rows have the same name but land in two different shards. When I query by ID, for example ID 5, VTGate routes me straight to shard 0, because I use ID for sharding. But sometimes I cannot provide the ID and can only query by name. If the name is Lisa, how should we route? We would have to query every one of the four shards once; shards 2 and 3 do not contain the row, but we still have to check them. This amplifies the reads. How does Vitess avoid this amplification? Upon inserting ID 5, I also save the correspondence between Lisa and 5 in a lookup table. Then, when I query for Lisa, I search that table first, and it tells me the matching IDs are 5 and 11.
So I only need to go on to search shard 0 and shard 1, not all four shards, and I avoid the unnecessary reads and access. With four shards, or even 200 shards, scattering every such query would add very significant latency.

Second, I'd like to talk about how we run Vitess at JD. This is our internal architecture. Applications connect through the MySQL protocol to VTGate, and VTGate connects through gRPC to our shards. Each shard encompasses two components, VTTablet and MySQL, and query results are fed back through VTGate to the apps. We use etcd for the topology metadata storage, and VTGate has a watch mechanism that watches routing information and other metadata changes in real time, so the gateways stay synchronized with the shards efficiently. Many JD businesses also have requirements beyond MySQL itself: in some scenarios we need to synchronize data onto our big data platform in real time by extracting the binlog. At JD we have a binlog service called BinLake, which extracts the binlog in real time and offers subscription services to users, enabling real-time synchronization of data. Around Vitess, JD has four key systems. One of them, a real-time backup system, is still under development; the other three are already being applied online at massive scale, and I will focus on those three services.

The first is JTransfer. What does it do? Many of our previous services used MySQL directly, and now that Vitess is online, we are switching those services over to Vitess. JTransfer extracts and replays the binlog in real time to synchronize the old MySQL data into Vitess, with no master-replica latency.
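Before moving on, the lookup-table idea from the secondary-index example can be sketched as follows. The shard layout, names, and in-memory dictionaries are illustrative assumptions standing in for Vitess's lookup vindexes.

```python
# Sketch: sharding is by ID, but a side table maps name -> IDs, so a query
# by name touches only the shards that can match instead of all of them.

NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]   # per-shard {id: name} storage
name_lookup = {}                               # name -> set of matching IDs

def shard_for(row_id):
    # IDs 1..10 -> shard 0, 11..20 -> shard 1, etc., as in the talk's example.
    return (row_id - 1) // 10 % NUM_SHARDS

def insert(row_id, name):
    shards[shard_for(row_id)][row_id] = name
    # Maintain the lookup table on every insert.
    name_lookup.setdefault(name, set()).add(row_id)

def query_by_name(name):
    # Consult the lookup table first, then read only the owning shards.
    ids = name_lookup.get(name, set())
    return {i: shards[shard_for(i)][i] for i in ids}

insert(5, "Lisa")
insert(11, "Lisa")
```

A query for "Lisa" now reads only shard 0 and shard 1 rather than scattering to all four.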
With JTransfer, a business module that used to access the MySQL cluster only needs to point at the new cluster and restart its services: it used to connect to the MySQL cluster, and now it connects to the Vitess cluster. That is what JTransfer does. It is mainly used for the key accounts of big businesses at JD, which have many nodes and a lot of data. This way, we no longer have to worry about storage space: in the past there might be 32 shards and we had to expand the keyspace by ourselves, which was very difficult and time-consuming; now, with Vitess keyspaces, there are no storage problems at all.

The next system we use internally is also a Vitess-related component: Orchestrator, an open-source tool; we call our deployment Orc. On the left side is the Orc cluster, and on the right side is the Vitess cluster. The Orchestrator cluster regularly, say every 10 seconds, retrieves state from all the MySQL nodes: the master-replica topology, data, and replication lag. The retrieved state is stored in its internal storage and then analyzed, to tell whether there are problems such as master-replica replication lag. If there is a problem with a replica, it selects the replica with the least lag to switch to; and if the master is dead, it triggers a failover to another master. No manual intervention is required in the process.

Another component used internally is MOLE. For launching a business, there are two steps. The first step is the request: during the request, database resources are allocated, for example how much CPU and how much memory. After the request, the project manager approves it, and after that approval, based on JD's resources, there is a DBA approval.
After that, the online database is created. The way we create a table is similar, including a request, an approval, and a DBA approval in which the DBA uses their professional knowledge to evaluate whether the table creation carries risk factors and whether it is compliant with our regulations. After that, the online table is created, and after the creation there are further interfaces to Kubernetes to apply for resources, scheduling, and so on. So the process of creating a database or table is not too difficult. But changing the structure of a table is not that easy. Our R&D people have some knowledge of databases, but their understanding is somewhat different from ours. We see business requests with no primary key, which is something we do not recommend; or the primary key is not auto-increment, where we would recommend AUTO_INCREMENT; or they use a deprecated storage engine or deprecated character set. Or someone creates a table that actually already exists, with data in it, and then uses a DROP statement to delete it, which is risky behavior. And there are even simpler mistakes, like plain syntax errors. These are common problems and unavoidable. This brings me to something interesting we can do with Vitess, even for users who do not run Vitess. As I mentioned, a submitted CREATE TABLE might have issues, for example a syntax error. Vitess has one advantage here: it has a very good modular design with a very low coupling level, so we can reuse individual modules without changing them. For example, I can pull in the SQL parser module by itself to parse the statement; if there is a syntax error, I can immediately tell the business department that there is a syntax error in their SQL.
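Pre-flight checks like these can be sketched as follows. A real implementation would reuse Vitess's SQL parser; here a few simple string checks stand in for it, purely as an illustration, and every rule shown is one of the common mistakes listed above.

```python
# Illustrative pre-flight DDL checks; a regex stands in for a real SQL parser.
import re

def precheck_create_table(sql, existing_tables):
    errors = []
    m = re.search(r"create\s+table\s+`?(\w+)`?", sql, re.IGNORECASE)
    if not m:
        # Catch plain syntax errors before any human approval step.
        errors.append("syntax error: not a CREATE TABLE statement")
        return errors
    table = m.group(1)
    if table in existing_tables:
        errors.append(f"table {table} already exists")
    if "primary key" not in sql.lower():
        errors.append("no primary key defined")
    if "auto_increment" not in sql.lower():
        errors.append("primary key is not auto-increment")
    return errors
```

Anything this catches is bounced straight back to the submitter, so the review workflow never sees it.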
So there is no need for PM and DBA approval just to identify the error and eventually have the request resubmitted. The same goes for primary keys and other constraints: if the SQL parses without syntax errors, I can acquire the name of the table, the column information, and the primary key information. With that, I can similarly execute a SHOW TABLES online and check whether the table already exists. If yes, I tell R&D that their table already exists and there is no need to submit the SQL again. If the table does not exist, I can follow our rules to examine and verify the schema information and whether the SQL is compliant. It is very easy to deliver these functions.

Next, for people interested in Vitess, or potential users of Vitess, some advice. My first advice follows from the example I just mentioned: reuse the Vitess code. Even if we do not run the Vitess service, we can still use Vitess code for some of our in-house development, for example the parser module; actually, in China, many teams are already doing that with the Vitess parser module. Besides, resharding first extracts the binlog and translates it back into MySQL statements, and all of those related modules can be easily extracted and utilized, because these MySQL-facing modules are not coupled to Vitess's own business logic. My second advice: let Vitess run at the lowest operational cost first. Vitess is a cloud-native database solution, and a friend of mine who uses Vitess tried to run it the official, fully cloud-native way from the start. Some big teams have Kubernetes experience and data-center establishment experience, but smaller teams may never have used Kubernetes. So take it step by step: Vitess can run on Kubernetes, but the first step should be to run Vitess on your existing Docker hosts or physical servers. This is easy to do.
After that, you can provide online production services, and the next step would be further development and deeper integration. Especially around operations and maintenance: the code may have bugs, and operations may be mishandled, but don't let one problem affect all your services. If the deployment is big, with hundreds of businesses sharing the same cluster, then one mishandled procedure affecting everything would be disastrous. So my fourth advice is to split large clusters into multiple small clusters. JD is running the largest Vitess cluster in the world, and if you are taking the lead, you will encounter problems that no one else has encountered. If you don't want to invest in R&D to solve those problems, split your large clusters into multiple smaller ones to avoid unexpected issues. Later on, my colleague Jinke will share those problems and our solutions.

Good morning. Haihua has shared the first and second parts; because of the time constraint, I will quickly go through the third and fourth parts. I will talk about some of the problems JD has encountered and the solutions we adopted, and then share our future plans. I will start with the challenges encountered. JD is operating, it is fair to say, the largest Vitess cluster in the world, or at least in China. Online we handle over 20,000 QPS and around 3,000 TPS, and our largest single Vitess cluster has around 400 nodes. Running at this scale, we have encountered many problems. First of all, the variety of demands: users tell us about many things Vitess does not support, and once we find these gaps, we have to manage all of those demands. There is also interference between different apps, because clusters are shared, so apps want priorities within the cluster. During our operations, we also ran into problems in the topology service; for example, the metadata grew too big, which sometimes caused OOMs.
Then there is the gateway: because we have a lot of users, how to upgrade the VTGates is also a challenge for us. First of all, let me tell you how we solved these problems, starting with the varied demands. We are based on Vitess 2.0, and we made custom modifications at JD: we added support for protocol features such as lock and prepare statements, and we supported some more complex, complicated operations. We also found bugs in Vitess, including some leaks; later we will contribute the fixes upstream as PRs. Next, for the oversized metadata, we migrated the topology data from etcd to Redis. As for upgrading the VTGates, we use the characteristics of the client side to do upgrades that are transparent to clients. Because we provide different kinds of accounts, such as read-write and read-only, some clients can read directly from the replicas, which separates reads from writes and avoids some of the problems at the gateways. We also provide dedicated VTGates for the apps with the biggest demands in the clusters, and we have very detailed monitoring.

For the future, we want to make resharding operations more convenient, do the scheduling more intelligently, and migrate from 2.0 to 3.0, because we are currently on 2.0 and 3.0 can satisfy more demands. On monitoring: we want to cover VTGate, VTTablet, MySQL, and so on, so that we can give users better, more detailed and accurate views. For the original UI, and for the DBAs, we need to simplify some of the operations. Intelligent scheduling is very interesting for us: we are using machine learning algorithms to predict the resources users will need, such as CPU, and to do the scaling work accordingly, so that it can save costs for the company. And later, we are going to migrate from 2.0 to 3.0.
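As a toy illustration of that scheduling idea only: JD's actual system uses machine learning models, but the "predict demand, then scale" shape can be sketched with a simple moving average. All numbers and names here are made up for the sketch.

```python
# Toy sketch: predict next-interval CPU demand from recent usage, then
# compute how many replicas are needed, rounding up so capacity covers it.
import math

def predict_cpu(history, window=3):
    # A moving average stands in for a real ML prediction model.
    recent = history[-window:]
    return sum(recent) / len(recent)

def replicas_needed(predicted_cpu, cpu_per_replica=2.0):
    return max(1, math.ceil(predicted_cpu / cpu_per_replica))
```

Scaling down when the prediction is low is where the cost savings come from; scaling up ahead of demand is what keeps the shared clusters from interfering with each other.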
That is the work ahead of us. Today, my colleague and I are very honored to come and share this with you. During our use of Vitess at JD.com, we have had a lot of help from the community and from our friends, and I also want to thank the Vitess team for all their support. Next, you will hear more talks from the Vitess team, and they are at booth SE16, where you can also meet with us.