OK, can you hear me clearly? Oh, echo. Thank you, very glad to see you here. It was not an easy trip — I had a long journey flying from China to San Francisco, and then from San Francisco to Berlin. So I'm very glad to see you here. Today I will talk about how we are using Raft in Rust. Does anybody here know Raft? Oh, great. Amazing, so many.

Before I start, let me introduce myself. My name is Liu Tang; you can call me Siddon. I am the chief engineer of PingCAP, and we have been developing a next-generation hybrid transactional and analytical database called TiDB. Maybe some of you know it. Based on TiDB, we have built a distributed transactional key-value database called TiKV, and it's written in Rust. In my spare time I am also an open-source lover, and I have developed some open-source projects like raft-rs, which I will talk about today, grpc-rs, go-mysql, and LedisDB. That's all about me.

Today's talk has four parts. First, I will talk about why we need Raft. Then I will give a brief introduction to Raft. Then I will show you how we use Raft in Rust. And at the end, I will talk about our product, TiKV.

Let's begin. The first part is why. Why do we use Raft? Assume that you want to build a database or a storage service. Maybe the simplest thing is to use one node — say MySQL or PostgreSQL — and that's it. The client writes data to this node, and the client reads data from this node. It works well. But as you can see, there is only one node, so it is a single point of failure. If the node crashes, the service is unavailable — horrible. And furthermore, if the node cannot be recovered, you lose all your data, which is unacceptable for a database. So how can we solve this problem?
Mostly, what we need here is replication. As you can see, we add a new node, and we call the old node the master and the new node the slave, like the common MySQL setup. For better performance, we mostly use asynchronous replication: when the client writes data to the master, the master replies to the client that the data is saved, and then the master replicates the data to the slave asynchronously. This works well, and most MySQL architectures use this replication mode.

But asynchronous replication has a problem: because the replication happens in the background, sometimes the slave doesn't have the latest data. If the master goes down and the slave is promoted to be the new master, we may find that the slave is missing the latest writes. In this example, the master has A and B, but after failover we lose B. This is still unacceptable in some critical database scenarios. So how can we solve this problem?

Here, mostly, we use synchronous replication. Unlike asynchronous replication, the master must guarantee that the data has been replicated to the slave before it replies to the client that the data is saved. This way, we can guarantee that our data is safe: even if the master crashes and the slave is promoted to be the new master, we know it already has the latest data, so we cannot lose it.

But there is still a problem: as you can see, we have only two nodes. If the master goes down and the slave is promoted to be the new master, there is only one node left, and we meet the single-point-of-failure problem again. So how can we solve this? Mostly, we can't use only two nodes; we need multiple nodes. To tolerate one node failing, we usually need at least three nodes: for one master, we need two slaves. And as I said before, we use synchronous replication here.
But there is a trade-off here between performance and high availability, because fully synchronous replication to every slave reduces performance. So here we often use quorum replication, which means that if the master finds that a majority of the servers have already replicated the data, it can consider the data saved and consistent. As you can see, the master only replicates the data to one slave, so the data is saved on two nodes out of three — a majority — and the master considers the data safe.

Then another problem comes. Now that we have three replicas, or even more, when the master dies, which slave do we need to promote to be the new master? Obviously, we need to promote the slave which has the newest data. But how can we know which slave has the newest data? This is one problem; I'll leave it here.

The other problem is that because we now have multiple replicas, we are in a distributed system, and each node communicates with the other nodes through the network. The network can be broken, so we will meet scenarios like split brain. Here you can see that the old master is isolated from the other nodes, and a new master is promoted. So now we have two masters: the old, isolated master and the new one. And at this time, unluckily, some client still connects to the old master. So we have a problem: can this client still read data from the old master? Yes or no? Assume that we have a key A with value 1 on the master. If the client reads A and gets value 1, it's OK so far.
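To make the two ideas above concrete — quorum commit, and promoting the slave with the newest data — here is a minimal sketch. This is my own toy illustration, not PingCAP code: in Raft, a log is "newer" if its last entry has a higher term, or the same term and a higher index, and an entry counts as committed once a strict majority of nodes store it.

```rust
// Toy illustration of two ideas from the talk (not real raft-rs code):
// 1) quorum commit: an entry is safe once a strict majority stores it;
// 2) leader choice: promote the node whose log ends with the highest
//    (term, index) pair -- that node has the "newest" data.

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct LastEntry {
    term: u64,  // term of the last log entry (compared first)
    index: u64, // index of the last log entry (tie-breaker)
}

/// An entry acknowledged by `acks` nodes out of `cluster_size` is
/// committed once a strict majority stores it.
fn is_committed(acks: usize, cluster_size: usize) -> bool {
    acks * 2 > cluster_size
}

/// Pick the node with the most up-to-date log. Deriving `Ord` on
/// (term, index) gives exactly Raft's "newer log" comparison.
fn newest_node(nodes: &[(u64, LastEntry)]) -> Option<u64> {
    nodes.iter().max_by_key(|(_, last)| *last).map(|(id, _)| *id)
}

fn main() {
    // 2 of 3 nodes have the entry: committed. 1 of 3: not yet.
    assert!(is_committed(2, 3));
    assert!(!is_committed(1, 3));

    // Node 2's log ends in a later term, so it should be promoted.
    let nodes = vec![
        (1, LastEntry { term: 4, index: 10 }),
        (2, LastEntry { term: 5, index: 8 }),
        (3, LastEntry { term: 4, index: 9 }),
    ];
    assert_eq!(newest_node(&nodes), Some(2));
    println!("ok");
}
```

Note that "majority" means a strict majority: with 3 nodes, 2 acks commit an entry, which is why one slave plus the master is enough in the picture above.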
But later, if we write a new value 2 to key A through the new master, and the client still reads A with value 1 from the old master, that is unacceptable, because it breaks data consistency: we wrote new data, but the client read stale data. This is unacceptable in critical database scenarios. So how can we solve this problem?

As you can see, I have listed many problems on the way from one node to multiple nodes. Luckily, we have consensus algorithms, and a consensus algorithm can help us solve the problems I listed above. In the current world, there are two popular consensus algorithms: one is Paxos, and the other is Raft. Here, I will only talk about Raft, because I think Raft is simpler and easier to understand than Paxos, and easier to use in production. And I even think that you can use our Raft library in production soon.

So Raft is easy, I think. If you want to master Raft, you only need to know two things: one is how Raft elects the leader, and the other is how Raft does log replication. I even think that if you learn these two points, you have already mastered Raft.

OK, let's begin. The first topic is election. In a Raft cluster, every node has one of three roles: leader, follower, or candidate. Every Raft cluster has only one leader, and the leader is elected by a majority of the peers. Once the leader is elected, only the leader can handle client requests: not only writes but also reads must go through the leader. Because we have only one leader at a time, we don't meet the split-brain problem I mentioned before. And when the leader is elected, all the other peers are followers.
A follower does nothing but receive messages from the leader and do log replication. If a follower doesn't receive any message from the leader for a long time, it becomes a candidate and begins to elect a new leader.

This is the whole picture of the election. At first, all the peers are followers. After the election timeout, a follower becomes a candidate. The candidate votes for itself and sends vote requests to the other peers. When the candidate finds that it has received votes from a majority of the servers, it becomes the leader. Of course, if the candidate finds that someone else has already become the leader, it steps back to being a follower.

OK. When the leader is elected, only the leader handles client requests. For example, a client writes some data to the leader, and the leader uses its consensus module — here it is the Raft algorithm — to replicate this data to the other followers. Each node appends the command to its own Raft log. When the leader finds that a majority of the servers have received this Raft log entry, it can consider the entry committed. After the entry is committed, each node can apply it to its state machine. This is the common replicated state machine built on a replicated log, and this is the core concept of Raft.

As I said before, there are many things about Raft that I don't mention here, like terms, like the details of elections, like how membership change works — how Raft adds and removes nodes — and how Raft does pre-vote. If you really want to master Raft, you need to learn these concepts too. And of course, you need to do many optimizations to use Raft in production, like using pipelining and batching to speed up the network transport, and using learners to make membership change more stable.
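The replicated state machine idea above can be sketched in a few lines. This is my own toy, with an assumed `Entry` type holding key/value put commands: the point is that if two nodes apply the same committed log in the same order, they end in exactly the same state.

```rust
use std::collections::HashMap;

// Toy replicated state machine: the "log" is a sequence of key/value
// put commands, and the state machine is just a HashMap. This is my
// own illustration of the concept, not raft-rs code.
#[derive(Clone)]
struct Entry {
    key: String,
    value: String,
}

/// Apply committed entries strictly in log order. This ordering is
/// what makes every replica converge to the same state.
fn apply(log: &[Entry]) -> HashMap<String, String> {
    let mut state = HashMap::new();
    for e in log {
        state.insert(e.key.clone(), e.value.clone());
    }
    state
}

fn main() {
    let log = vec![
        Entry { key: "a".into(), value: "1".into() },
        Entry { key: "a".into(), value: "2".into() }, // overwrites a=1
        Entry { key: "b".into(), value: "3".into() },
    ];
    // Two replicas applying the same log reach the same state.
    let replica1 = apply(&log);
    let replica2 = apply(&log);
    assert_eq!(replica1, replica2);
    assert_eq!(replica1.get("a").map(String::as_str), Some("2"));
    println!("ok");
}
```

This is why Raft only needs to agree on the log: agreement on the log order implies agreement on the state.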
And luckily, all the optimizations I listed above are supported in our Raft library, raft-rs. You can see the repo here, and you can use it directly from crates.io. Our Raft library is inspired by etcd's Raft, because etcd's Raft has been used in production for a long time, and many popular projects like Kubernetes and CockroachDB already use it in production. So it was a good starting point for us.

The Raft library is a tiny, lightweight library. You can embed it into your application easily to provide a consensus layer. The library focuses only on the consensus algorithm itself, so you need to take care of how to save the Raft log, how to apply committed entries to the state machine, and how to communicate with the other Raft nodes by yourself.

Using the Raft library is very easy, and you can see the whole picture here. First, you need to create a Raft node — oh, unluckily, I see an error on the slide. You need to create a Raft node, and every Raft node is a state machine. What we need to do is drive this state machine from one state to another. Mostly, we have three ways to drive it. One is tick, which means we drive the Raft node regularly: for example, every 100 milliseconds we call tick, tick, tick to drive the node. Another way is propose: the client sends a request to the leader explicitly, and we use propose to drive the Raft node. And here is step: a Raft node will receive messages from other Raft nodes, and we use step to feed those messages in. After we drive the Raft node, it moves to some state, and sometimes it moves to a ready state. In the ready state, we can do something.
From the ready, we can get entries and append them to the Raft log, we can get committed entries and apply them to the state machine, and we can also get some messages and send them to the remote Raft nodes. After we finish handling all of the ready, we call advance to move the Raft node on to the next state and start the loop again.

So, when you want to create a Raft node, the first thing you need is to create a Raft storage. It's very simple — you can see that there are only six functions. initial_state returns the initialized state of the Raft node; the state includes the current term, the current commit index, and the current leader. entries returns a slice of the Raft log from low to high. term returns the term of a given log index. first_index and last_index return the first and the last index of the whole Raft log. And snapshot returns a snapshot of the current state machine — you can think of it as taking a picture of the state machine at this moment.

Then you can create a Raft node with your storage and with your configuration. Here you can see that the configuration is very simple. Each Raft node has a unique ID, and then here are the election tick and the heartbeat tick. You may remember that I mentioned tick before. The election tick is for the follower: if the follower ticks, say, 10 times without receiving any message from the leader, it becomes a candidate and begins an election. The heartbeat tick means that every, say, 3 ticks, the leader sends a heartbeat to the other followers. So, after we create a Raft node, we can do something with it.
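Here is a rough sketch of what the election tick means. This is my own model of the counter, not the library's internals: the follower simply counts ticks since the last message from the leader, and a message from the leader resets the count.

```rust
// Toy model of the election-tick counter described above (my own
// illustration, not raft-rs internals). With election_tick = 10, a
// follower that goes 10 ticks without hearing from the leader starts
// an election. The leader-side heartbeat tick (e.g. every 3 ticks)
// exists precisely so that a healthy leader resets this counter well
// before it expires.

const ELECTION_TICK: u32 = 10;

struct Follower {
    ticks_since_leader_msg: u32,
}

impl Follower {
    /// Returns true when the follower should become a candidate.
    fn tick(&mut self) -> bool {
        self.ticks_since_leader_msg += 1;
        self.ticks_since_leader_msg >= ELECTION_TICK
    }

    /// Any message from the leader proves it is alive: reset the count.
    fn on_leader_message(&mut self) {
        self.ticks_since_leader_msg = 0;
    }
}

fn main() {
    let mut f = Follower { ticks_since_leader_msg: 0 };
    for _ in 0..9 {
        f.tick();
    }
    // A heartbeat arrives just in time and resets the timeout.
    f.on_leader_message();
    // Nine more silent ticks: still a follower.
    for _ in 0..9 {
        assert!(!f.tick());
    }
    // The tenth silent tick triggers an election.
    assert!(f.tick());
    println!("ok");
}
```

This also shows why heartbeat_tick must be much smaller than election_tick: otherwise followers would start spurious elections while the leader is still healthy.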
Here you can see, as I said before: tick drives the Raft node regularly, step receives messages from remote nodes and drives it, and propose receives commands from the client and drives it too. When we drive the Raft node, it may enter the ready state; we can call the has_ready function to check whether the Raft node is ready or not. When the Raft node is ready, we can get the ready. Here are the entries, which we append to the Raft log; here are the messages, which we send to the other remote nodes; and here are the committed entries, which we apply to the state machine. After we finish them all, we call advance. In the ready we also need to handle the snapshot, and handle leader changes, term changes, or commit index changes, but I don't cover those in this talk.

So, as you can see, using the Raft library is very simple, and I even think you can embed it into your application to provide a consensus-based database service easily. We have already done it ourselves — that's TiKV. TiKV is a distributed transactional key-value database based on raft-rs; you can see the repo here. We use RocksDB to save the Raft logs and the state machine data, we use gRPC for the network communication, and there's even more.

Now, do you know the CNCF? Anybody here? TiKV is now a CNCF sandbox project — well, it hasn't been announced yet, maybe it will be announced at the end of this month, so here I'm just getting a little ahead. And I even think TiKV may be the first database project in the CNCF written in Rust. So that's our ambition, and if you have been paying attention to us, you are welcome to join us and develop TiKV. That's all.
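The ready-handling loop described above can be modeled in a few lines. This is my own sketch with made-up field and type names; the real raft-rs Ready carries more (snapshots, soft state, hard state), but the order of operations is the point: persist new entries, send messages, apply committed entries, then advance.

```rust
// Toy model of the ready-handling loop (my own sketch, not the real
// raft-rs types). Order matters: persist new entries to the log and
// send outgoing messages before applying committed entries; in
// raft-rs you would then call advance() to start the cycle again.

struct Ready {
    entries: Vec<String>,           // new log entries to persist
    messages: Vec<String>,          // messages for the other nodes
    committed_entries: Vec<String>, // entries to apply to the state machine
}

struct Node {
    log: Vec<String>,     // persisted raft log
    applied: Vec<String>, // "state machine": just the applied commands
    outbox: Vec<String>,  // messages handed to the transport layer
}

impl Node {
    fn handle_ready(&mut self, ready: Ready) {
        // 1) Persist the new entries to the raft log.
        self.log.extend(ready.entries);
        // 2) Send the messages to the remote nodes.
        self.outbox.extend(ready.messages);
        // 3) Apply the committed entries to the state machine.
        self.applied.extend(ready.committed_entries);
        // 4) In raft-rs, call advance() here to finish this ready.
    }
}

fn main() {
    let mut node = Node { log: vec![], applied: vec![], outbox: vec![] };
    // First ready: a proposed entry is persisted and replicated out.
    node.handle_ready(Ready {
        entries: vec!["put a=1".into()],
        messages: vec!["append to follower 2".into()],
        committed_entries: vec![],
    });
    // Later ready: a majority has the entry, so it comes back committed.
    node.handle_ready(Ready {
        entries: vec![],
        messages: vec![],
        committed_entries: vec!["put a=1".into()],
    });
    assert_eq!(node.log.len(), 1);
    assert_eq!(node.applied, vec!["put a=1".to_string()]);
    println!("ok");
}
```

Note the gap between the two readies: an entry appears in `entries` (persist it) before it appears in `committed_entries` (apply it), because commitment requires a majority of nodes to have stored it first.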