Let's have a look at the title of the talk: building modular and scalable consensus using the Mir framework. There are some important keywords here: modular, scalable, and consensus. Consensus especially is what a lot of these talks revolve around.

First, let's think for a moment about the ecosystem we are in. We have a really big diversity of agreement protocols and their specifications. The protocols may implement slightly different guarantees, and new distributed protocols appear literally every week; some are faster, some have higher throughput, some have lower latency. It is a really dynamically evolving ecosystem of protocols. But which protocol is here to stay? Which one will not be immediately replaced by some newer protocol? Well, one way to look at it is the way Charles Darwin supposedly looked at consensus protocols: it is not the consensus protocol with the highest throughput that survives, nor the one with the lowest latency, but the one that is most adaptable to changing requirements. And he was a very smart guy. Fun fact: he did not actually say that. It was not Charles Darwin who said that; it is a commonly believed myth. It was actually Leon Megginson, a professor of management and marketing in Louisiana, who said it in his lectures around 1963.

All right, but let's get back to actual consensus protocols. This will be a rather technical talk compared to the previous one, which was more visionary and involved the whole galaxy, or at least the solar system; here we will focus on the computer. First, I will explain why consensus alone is by far not enough: what we actually want is state machine replication. State machine replication is a different problem from consensus, and it involves many other sub-problems that need to be solved, that are being researched separately, and that are implemented separately. Then I will show you our approach to modular state machine replication: a high-level architecture of a system that implements state machine replication with well-defined modules that can be implemented separately and, especially, that can evolve and scale separately. And then I will have a few slides about how, in practice, we implement this quickly in Go, with reusable modules, and how we can debug it easily.

So, let's look at why consensus is not enough. First, what is consensus? A very quick recap: consensus is a problem whose solution takes a value as input and outputs a value, with the nice property that if multiple nodes use a solution to this problem, all of them will eventually agree on some value that is not completely arbitrary. That is consensus. It is important, and it is often a genuinely hard problem to solve. But what we want to know is who won the distributed auction and how much they pay to whom. Consensus tells us 42; it tells everybody 42, eventually, but that does not really help us much. So, to order, for example, the bids in a distributed auction, we need to solve total order broadcast, also called atomic broadcast, which is not the same as consensus.
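To make the difference concrete, here is a minimal Go sketch of the two abstractions. The names are invented for illustration and are not any real library's API: consensus produces one decision, while total order broadcast keeps delivering values in a single agreed-upon order.

    package sketch

    // Consensus: every node proposes a value, and all correct nodes
    // eventually decide on the same, not completely arbitrary, value.
    type Consensus interface {
        Propose(value []byte)
        Decide() []byte // blocks until the agreed-upon value is known
    }

    // TotalOrderBroadcast (atomic broadcast): nodes keep broadcasting
    // values, and all correct nodes deliver the same values in the same
    // order. Conceptually, this is one consensus instance per position
    // in the sequence.
    type TotalOrderBroadcast interface {
        Broadcast(value []byte)
        Deliver() []byte // returns the next value in the total order
    }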
Total order broadcast is, in a way, equivalent to consensus, in a very well-defined sense: it can be implemented using infinitely many instances of consensus, and so on. But it is not the same thing. With total order broadcast, we can order values. But who won the distributed auction, and how much money do they pay to whom? What we get from total order broadcast is a totally ordered sequence of values. That is nice; we can decide which value goes first, but it still does not tell us what we want to know. So we take total order broadcast and use it to implement what we call state machine replication, which, again, is not the same thing. We feed a sequence of values into a node that has some state, that can execute transactions, that can apply those transactions to the state, and that can communicate with other nodes replicating the same state machine. There may be clients talking to it, and so on. And there are more things to take care of: garbage-collecting old state, for example, or transferring state to a node that has fallen behind. And I have not even started talking about the problem of, say, disseminating the transaction payloads to the nodes that need to execute them.

So, who won the distributed auction, and how much did they pay to whom? Well, now we have agreed on some values, ordered them, applied them to some state, and performed some execution, and the result of that execution can be interpreted in an application-specific way. Now a client can actually ask a node and learn: OK, I won the auction, and I pay 42 coins to Alfonso. This is state machine replication, and this is what we need in the end. It is not the same as consensus; consensus is just one piece of it. We need to do data dissemination, we need to do execution, we need to do a lot of stuff, and all of these problems are non-trivial. All of them are being researched and tackled separately, and we need to combine solutions to all of them to get what we want. That is very complex.

So the bottom line is that consensus is not enough. What we actually need is state machine replication, and to get it, we need to solve many sub-problems that are often already well defined by the scientific community, that are researched separately, and that have different kinds of interesting solutions. We need to solve all of them to get what we want.

Now I will continue by showing, at a very high level, what our concrete approach to solving these problems is. We are building a system that we call Trantor. It is a modular SMR system with modules that can be implemented separately, that can evolve separately, and that can scale separately; putting all of them together, we get our state machine replication. This is what we are going to put into the Filecoin subnets in Interplanetary Consensus, which both Alfonso and Juan already mentioned.

So let's look at what Trantor is. It is a modular state machine replication implementation. Each sub-problem I mentioned before is addressed by a separate component that we call a module. Each module is a well-defined entity with a well-defined interface: it produces and consumes events, and its implementation can be created separately from the other modules.
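As a rough illustration of what such a module interface can look like, here is a Go sketch in the same spirit, continuing the illustrative sketch package above; the Event and Module types are simplified stand-ins, not Trantor's or Mir's actual definitions.

    // Event is a unit of communication between modules.
    type Event struct {
        DestModule string      // name of the module that should consume the event
        Payload    interface{} // event-specific data
    }

    // Module is the common shape of every component: it consumes one event
    // at a time and returns the follow-up events that the event triggered.
    type Module interface {
        ApplyEvent(event Event) ([]Event, error)
    }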
And if we see that in our system one module is actually the bottleneck for transaction throughput, we can zoom into that module and ask: why does this not work? Why does this not scale? Let's find some other, better algorithm, tune our implementation, and plug it back into the system once we have improved it and some other part has become the bottleneck.

Now, we were also talking about different Filecoin subnets, tuned to different applications and supposed to run in very different environments. There is one subnet orchestrating everything that happens on a planet, and there is one subnet running just within a data center. These are very different requirements, and the consensus protocol will probably need different choices and different things to optimize in order to work properly. So our vision is that we take Trantor and can easily create different flavors of it, to really tune it to different deployment scenarios with different requirements. This is all work in progress; it is being developed literally every week, with pull requests on our GitHub page.

So how does it work? At a very high level, the architecture of Trantor is the following. First, we need to get transactions from clients, or from smart contracts, or from somewhere; we need an abstraction that holds the transactions waiting to be executed. That is the mempool. The mempool feeds the transactions to what we call the availability layer, which takes care of, and focuses only on, disseminating the payloads of the transactions to the whole distributed system, or at least to a majority of the participants. The availability layer produces batches of transactions and availability certificates. That is, when this module says that a batch of transactions has been disseminated to sufficiently many computers that we can consider it available, it produces a verifiable certificate that this is the case. We store these batches and certificates in a batch store. It can be just a dummy key-value store, or it can be something smarter that optimizes for other things, for example for later fetching by the consensus protocol; this part is deliberately left open and flexible.

Then we take only the availability certificates, which are small compared to the transaction payloads, and we order those. In the past, many consensus protocols, or total order broadcast protocols, handled the full payloads of the transactions. That became the bottleneck: it was complex and complicated, and many trade-offs had to be made. Here, we only order small pieces of data, the availability certificates, each representing a batch of transactions. Out of the ordering component comes a sequence of availability certificates. We then take those and assemble the batches they reference, and that gives us a consistent, totally ordered sequence of transaction batches, which we hand to an execution module. Again, the execution module can be, and often actually is, the bottleneck in state machine replication systems. Here, we can focus on performing the execution efficiently, depending on what kind of machines we are using, what kind of processors, and what kind of workloads we expect. We can parallelize it, and since it is all encapsulated in one module, we can focus on optimizing and scaling just that.
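To make the pipeline concrete, here is a heavily simplified Go sketch of the data flow, continuing the illustrative types above. All names are invented, and the real modules communicate through events rather than through a direct call chain like this.

    type (
        Tx    []byte
        Batch []Tx
        Cert  []byte // small, verifiable availability certificate for a batch
    )

    type Mempool interface{ NextTxs() []Tx }

    type Availability interface {
        // Disseminate spreads the payloads to enough nodes for the batch
        // to be considered available, and returns a certificate proving it.
        Disseminate(txs []Tx) (Batch, Cert)
        // Fetch resolves a certificate back to the batch it references.
        Fetch(c Cert) Batch
    }

    type BatchStore interface{ Put(c Cert, b Batch) }

    type Ordering interface {
        Propose(c Cert)
        Delivered() <-chan Cert // certificates in the agreed-upon total order
    }

    type Execution interface{ Apply(b Batch) }

    // run wires the stages together. Only the small certificates go through
    // ordering; the payloads travel through the availability layer.
    func run(mp Mempool, av Availability, bs BatchStore, ord Ordering, ex Execution) {
        go func() {
            for {
                batch, cert := av.Disseminate(mp.NextTxs())
                bs.Put(cert, batch)
                ord.Propose(cert)
            }
        }()
        for cert := range ord.Delivered() {
            ex.Apply(av.Fetch(cert)) // reassemble the ordered batch, then execute
        }
    }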
So this was the high-level view of Trantor. To summarize: it uses modules with well-defined interfaces that can be implemented separately, that can evolve separately, and that can scale separately. And this is what we are going to use in our Interplanetary Consensus implementation.

The last part of my talk is about very practical things, namely how we actually implement this in code. For this, we use Mir. Mir is a framework that allows fast development of distributed protocols, with reusable modules and a powerful debugging mechanism. Mir, really, is just a tool for expressing distributed algorithms in Go, because Go is the language we are using now; we plan to extend this to multiple languages in the future, and it actually should not be that hard to do. It is a library: each node instantiates the library in a process running on its machine, and the library executes the local steps of the specified algorithm. It is an open-source project; you can check it out on GitHub, where you will find documentation and code. We are very happy to answer any questions you post there, and collaboration is very welcome.

So how does Mir work? We saw all these modules in our distributed systems, all these components that produce and consume events, and this is exactly how Mir works in terms of code as well. The modules of the system are represented as objects in the Mir framework. They consume events and they produce events, and the implementation of a module is basically just a description, an algorithm, of what to do when an event is received and how and when to produce more events.

To give you a concrete example, we have this pseudocode on the left, in the very typical style in which distributed protocols are defined in the scientific literature: when some event occurs, the pseudocode describes what to do. Mir mirrors this directly in Go code. For example, here you see that upon the initialization event, you set some counters to zero, maybe do some other stuff, and trigger another event. For that, you only need to write the implementation of the applyInit function that sets the module's counter state to zero and returns a list of events, in this case a single Ready event. The same goes for receiving a message: you just implement a function that takes the message and its source as arguments; it can, for example, check the message, ignore it if it is bad, and return the list of events that the reception of this message triggered. So this is the implementation of a single module: you can take pseudocode, rewrite it very similarly in the Go programming language, and you have the module.
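As a sketch of what that can look like, here is a toy counter module in the style just described, reusing the illustrative Event type from above; the names and signatures are simplified stand-ins, not Mir's actual API.

    // CounterModule is a toy module whose only state is a counter.
    type CounterModule struct {
        counter int
    }

    // applyInit mirrors the "upon initialization" clause of the pseudocode:
    // reset the local state and trigger a single follow-up event, here
    // addressed to a hypothetical "app" module.
    func (m *CounterModule) applyInit() []Event {
        m.counter = 0
        return []Event{{DestModule: "app", Payload: "ready"}}
    }

    // applyMessageReceived mirrors "upon receiving a message": validate the
    // message, ignore it if it is bad, and otherwise return the events that
    // its reception triggers.
    func (m *CounterModule) applyMessageReceived(source string, msg []byte) []Event {
        if len(msg) == 0 {
            return nil // bad message: ignore it, trigger nothing
        }
        m.counter++
        return []Event{{DestModule: "app", Payload: m.counter}}
    }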
Now you can take this module that you implemented, put it together with the implementations of the other modules that need to be part of the whole system, and combine them in a node. A node is really the Mir framework's abstraction that models a distributed system node. Somewhere in your code, in a normal simple application probably in your main function, you just call mir.NewNode. You give it a bunch of configuration parameters, like the node's own ID and where it should write its logging output, and, most importantly, you give it the list of modules that should be instantiated and work together. And when you call Run on the node, it will take the events output by these modules, buffer them in an internal buffer, and dispatch them to the appropriate modules that should consume them. Many algorithms are very, very easy to express this way.

When we create the node, there are two more parameters we can give it that are rather technical. But since they are pretty cool, I am going to give you a little glimpse of how, for example, the interceptor parameter can be used. When such a node is processing events, the events are taken from the buffer and dispatched, and on the diagram you see only one arrow. There is a purpose behind that: all these events are totally ordered; it is a single sequence of events that is being processed. So what you can do is attach an interceptor to the node, which hooks into this event dispatcher. All the events can then be, for example, stored on disk, extracted from the node, and inspected. You can have a look at them and see what is happening inside the node. You can even instantiate another instance of the whole node and inject the events one by one, watching what happens in the node while you are debugging it. You can save the trace of the events, instrument or change your implementation, and then replay the sequence of events to see what was happening and what was wrong with your code.
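Here is a sketch of that record-and-replay idea, again using the illustrative Event and Module types from above rather than Mir's actual interceptor API.

    // Interceptor sees every event at the moment it is dispatched; because
    // the event stream is totally ordered, recording it yields a replayable trace.
    type Interceptor interface {
        Intercept(events []Event) error
    }

    // TraceRecorder keeps the ordered event stream in memory; a real
    // recorder would append it to a file on disk instead.
    type TraceRecorder struct {
        Trace []Event
    }

    func (r *TraceRecorder) Intercept(events []Event) error {
        r.Trace = append(r.Trace, events...)
        return nil
    }

    // Replay feeds a recorded trace, one event at a time, into a fresh set
    // of modules (possibly an instrumented or fixed implementation) to
    // reproduce exactly what the original node processed.
    func Replay(trace []Event, modules map[string]Module) error {
        for _, ev := range trace {
            if _, err := modules[ev.DestModule].ApplyEvent(ev); err != nil {
                return err
            }
        }
        return nil
    }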
All right, so to summarize, today I talked about three things. First, consensus is not enough; what we actually need is state machine replication, and state machine replication requires solving many sub-problems that are very often researched and tackled separately. Second, I showed you Trantor, a state machine replication system that is modular, using modules with well-defined interfaces that are implemented separately, that can evolve separately, that can scale separately, that can be debugged separately, and so on; this system is what we will use as the protocol powering the subnets of Interplanetary Consensus. And third, at the end, I showed you Mir, a framework for implementing distributed protocols in the Go language, which enables fast development using modules in multiple compositions and gives you a very nice way to debug your code. That is all from me. If you have any questions, please raise your hand. Thank you.

Quick question: do the modules run on a single thread, or do they run in parallel?

They run in parallel. Every module is run by a single thread, and the synchronization is done at the node level.

OK, thank you. You mentioned that support for other languages is planned. Which languages, and where in the process does each planned language stand?

Support for other languages is not implemented now; it is something we are thinking of adding later. But the idea is that a module could be implemented in a language other than Go while the runtime stays in Go. Since the interface of a module is super simple, it just consumes events and produces events, you could, for example, have a Rust implementation of a module with a little wrapper around it that communicates with the node and the Go runtime through IPC, or even over the network. And why would we want other language support? Because many people work with different programming languages, and it is very nice if you can implement your module and your protocol in whatever language you are most familiar with. Literally everybody I talk to about programming says: oh, Rust is so great, I want to code in Rust. So giving these people the possibility to code their modules in Rust would increase the base of people who can use this by a lot, I guess.

So, conceptually, is a module a separate process? Or is it just something that you are communicating with over FFI?

Currently, a module is an object that implements some logic, and there is a thread handling the events coming into the module and the events that the module's logic outputs. It is all running in one process, with one thread per module. But we could have, in our runtime, a generic module, let's call it a remote module, which would just take all its input and send it out over the network. There could be some little server, implemented in a different language, Rust or whatever other language you want, that executes the logic and sends the resulting events back over the network. The remote module running in our runtime would then just translate them back to Go and return those events.

Say I build my application with a defined set of modules, and at a later stage of the application I want to change some of those modules. Is this possible in Mir, or do you have a plan for how to make it possible?

That is a very nice feature that we actually needed ourselves. We do not currently support it natively at the level of single modules, but we have a very easy workaround: we implemented what we call the factory module. Basically, it is a module that runs other modules inside it. The factory module can receive an event saying: hey, create a new submodule; that submodule then acts as a normal module, with the factory acting as a proxy, and you can create and destroy submodules within the module. Now, I said one module is run by one thread; that is the default case. We also have a way to spawn multiple threads within a module and parallelize the execution inside it. So, in fact, the factory module does not even suffer from single-threadedness: it can have multiple submodules, each run by a separate thread.

OK, thank you.

All right, fantastic. Thank you so much, Matej. Thank you.