Hello, this is Matej from ConsensusLab at Protocol Labs, and today I want to introduce you to Mir, our open-source framework for implementing distributed systems. Mir is really about implementing distributed systems the easy way. Today I'll just be talking about it at a high level: I'll give a short introduction, namely what it is and how it works. But first, let's look at what we are actually dealing with and what the background is. I think it's important to recall the model we're considering when implementing distributed systems. So I'll very quickly recall what a node is, what a distributed system is for our purposes, what an algorithm and a distributed algorithm are, what a distributed abstraction is, and how it is usually implemented. Then I'll tell you how Mir comes into play and how it makes it easy to implement these things. All right, so first, just to be on the same page: what is a node? A node is some entity that has some internal state. It can perform computation on that internal state and transform it, and it can also communicate with other nodes using message passing, so through sending and receiving messages. A distributed system is just a collection of nodes that can communicate through some communication network, again just by sending messages to each other. All right, so what is a distributed algorithm? A distributed algorithm is just a collection of algorithms that these nodes execute. Very often each node executes exactly the same algorithm as all the other nodes, but this is not necessarily always the case. So in general they can differ, but that's not really the point now; we just know that a distributed algorithm is a collection of algorithms executed by all the nodes of a distributed system. Now we are going to focus on a single node and how it executes its algorithm, always keeping in mind that all the other nodes have their own instances of their algorithms and are doing the same thing.
But we are looking at it now from the point of view of one node. Okay, so what is the algorithm the node executes? Well, in general, an algorithm is just a sequence of steps a machine can execute, at least for the purposes of this video. It could be something like: get one input from the user, get another input, add those inputs, store them on disk; while some condition is true, do something; and afterwards send a message to some IP address. So it is just a sequence of steps the node executes. Now, in a distributed setting, it is very useful to think about the algorithm from a slightly different point of view. Remember that the node is part of a bigger distributed system and receives inputs all the time, from that system and from the outside world or from the user. So it is very useful to break this algorithm apart into abstractions that interact with each other and that represent the system we are studying and the model we are considering. That's why distributed algorithms are very often described in terms of distributed abstractions. A distributed abstraction is some kind of black box that consumes events and produces events. Each node usually has a local instance of such a distributed abstraction, and each node feeds events to the abstraction and consumes events that the abstraction produces. For example, if we have two nodes and they have a link abstraction, the link might consume send-message events and produce message-received events, and the contract it might provide is that if one node invokes a send-message event on the link abstraction, then eventually the message-received event will be triggered by the abstraction at the other node. So each abstraction always has events that it consumes and events that it produces.
A more well-known abstraction, and in any case the one we are very interested in, is the consensus abstraction. It can also be modeled with just two events, propose and decide. It consumes a propose event and then promises that it will eventually trigger a decide event at all the nodes, that the decided value will be the same everywhere, and so on; that's also not really the point now. All right. Distributed abstractions can also be combined so that they use each other's events: one abstraction triggers an event that another abstraction consumes, and this can go arbitrarily deep. For example, if we have a consensus abstraction, we can see it as a black box executing some algorithm that consumes propose events and triggers decide events, but the inside of the abstraction needs to do something meaningful when it receives the propose event, and it needs to know when to trigger the decide event and with what value. So the implementation of an abstraction in general just needs to dictate how to react to a propose event and when to trigger the decide event, and the reactions to events can themselves be the triggering of more events. For example, when a propose event occurs at the consensus abstraction, it might trigger one send-message event to tell the other node what the proposed value is, and it can trigger another event setting some timeout in case the other node doesn't respond. It then knows what needs to happen when the timeout is triggered and when the message is received. For example, when the message is received, it changes some internal state; if the other node, let's say, agrees with the value, it can trigger the decide event, and in that state it will ignore the timeout event if it occurs.
So the bottom line is that the implementation of the abstraction, the algorithm that it executes, basically just describes reactions to events that come from the outside and conditions under which new events are triggered. This is how it's very typically expressed in pseudocode. The implementation of an abstraction, very often called a distributed algorithm (which is nothing more than an implementation of such an abstraction), is expressed in little blocks that describe what has to happen inside the abstraction when different events occur. For example, an abstraction implementation could react to an init event, to a message-received event, and to a timeout event: when initialized, it sets some counter to zero and triggers some ready event. When it receives a message, it can check whether the message is correct and consistent with its state, update the local state accordingly, and maybe trigger some other message events depending on what the message was. And on a timeout event coming from some timer, it could just trigger abort, which would in turn be consumed by some other abstraction. All these red-marked things on the slide are events: we are consuming events in the upon lines and producing events in the trigger lines. This is really the way distributed algorithms are most commonly specified. All right, so now we have the distributed-algorithm mindset; let's look at how Mir comes into the game. Mir is a tool for expressing these distributed algorithms in the Go language. It is a public open-source project that you can find on GitHub; the slides will be published along with this video, with a clickable link on them. Mir is basically a library that implements a framework such that a process using the library runs on each node of the distributed system, and the Mir library executes the local steps of the specified algorithm.
So basically, the programmer who wants to implement a distributed algorithm can instantiate Mir, define within the Mir framework the algorithm that they want to be executed, and Mir will just execute it. How does it work in practice? Let's look at how we model these abstractions. Mir provides the abstraction of a module, which is a one-to-one depiction of the abstraction in a distributed system. It is some entity, a black box from the outside, that just consumes events and produces events; we need to know nothing else about it. And of course it executes some algorithm that the programmer can specify. To be more concrete, imagine we have the pseudocode from before, with the init event and the message-received event. In Mir, the implementation would look like the code on the right. For each event that we need to react to, we write a function to apply that event, like apply-init or apply-message-received, and the body of the function just specifies what needs to happen when the event is triggered. Here we see that on the init event we can just set some internal counter to zero in the internal state of the module. The module is basically a Go struct: an object that can have fields that store its state, and this state can be modified through the event handlers. The events triggered by the abstraction, or rather by its implementation, are returned from the corresponding handler functions: whatever a handler function returns is an event that is considered to be triggered by that module. We see in the function signatures that all these event handlers return, apart from an error, a list of events, because they might trigger more than one event. When the computation is done, we return a list of events; in this case only one event, namely the ready event, which will be considered triggered, and Mir will route this event to the appropriate module. For the message-received event, it's analogous. All right.
So here I was talking about what we call a passive module in Mir, which really just transforms input events into output events, possibly updating its internal state. But it doesn't do anything on its own without being triggered from the outside by some event; it never creates events out of the blue. A passive module couldn't just tell the system, hey, a message has been received, because something needs to create this message-received event in the first place. So passive modules are good for, for example, the protocol logic, where we only really need to specify some state and some transformations on that state. Now, to communicate with the outside world, like receiving messages from the network or receiving timeouts from the operating system, we provide the so-called active module that can produce events even without being triggered. Technically, it is solved slightly differently: in Go, it actually exposes a channel, and whenever there's something of interest, it just writes that event to the channel. The Mir framework then reads that channel and routes these events to the appropriate places. You can have a look at the details in the documentation, which is also linked here; if you download the slides, you can click on the link and have a closer look. All right. So now we have the modules and we know how to describe an algorithm that needs to be executed. And really, as you see, it's very close to the protocol as described in the pseudocode. There is some more boilerplate code when we are instantiating these events and defining the modules and their state, but the vast majority of it is actually generated by the tools that come with the Mir framework. So we have the modules; what do we do with them? We need to somehow have the whole thing run. The main abstraction provided by the Mir library is the Mir node. A node represents a node in the distributed system.
And it's basically a collection of modules; it orchestrates those modules and routes events between them. For example, in the example from before, we have some consensus module, some link module, and some timer module; the consensus module has an apply-propose method that would return two events, namely a set-timeout event and a send-message event. The Mir node implementation would take these events, look up the timer and the link modules, and call the appropriate functions on those. How does it work in practice? In practice, the main function of some program that the programmer writes will, at some point, contain the mir.NewNode call, which just creates a new instance of a distributed system node. So let's go through the arguments first; this is, again, just an example. When we instantiate a node, we just need to tell it a bunch of stuff. We tell it its own ID. We tell it some configuration parameters, like where it should write the logging output and so on. And the most important part is which modules it should use. Each module has a name; in this case, it's the string on the left: the app, the protocol, the net, or the crypto module. On the right are variables that we need to have populated with instances of the corresponding modules. So, for example, you can have a user application module that prints output to the user. You can have a protocol logic module that decides how to react to messages and what state to keep on the protocol level. The network transport module is one that actually takes care of establishing network connections and sending and receiving messages; that is, it needs to be an active module. And we can also have the crypto as a separate module when we want a modular implementation of cryptographic operations. For example, when the protocol module needs to sign something, it would emit a signature-request event that the crypto module consumes.
The crypto module will compute the signature and trigger an event saying that the signature has been computed, and the protocol can continue its operation. Another advantage of this design is that the logic of each of these modules is executed in a separate goroutine, so we can parallelize a lot while keeping the logic of each module sequential, making it much easier to reason about. There are two more parameters that I haven't mentioned yet: a write-ahead log and an interceptor. Both of them are optional; if we pass nil, they're just not used. The write-ahead log is a persistent log that can be attached to a node so that important events are actually persisted to disk, which helps the node recover from crashes. The interceptor is also a very useful component that can be used for debugging. So what we actually have is the node as it looks in the picture: it has several modules that produce events, the events are stored in an internal event buffer, and the dispatcher dispatches these events to the appropriate modules again. Before the events are dispatched, we can intercept them with the interceptor and store them somewhere outside of the node; later we can inspect them with a debugger, or we can even instantiate a new instance of the node and inject those events one by one to have a very close look at what's happening. The node also has two more functions, run and stop, with the obvious meanings. After instantiation, we need to call run on the node so that all this machinery starts moving, spawning the thread that distributes the events and collects them from the modules and so on. At the end, we just call stop, which stops everything. All right, that was a very first introduction to the Mir framework. Please go check it out on GitHub, and next time we'll be building a sample application from scratch using the Mir framework.
So that will actually be a coding video where I'll start from scratch and step by step create a sample application, so you can see how this actually works and how you can use it as well. Thank you very much for watching. Bye-bye.