Hi, everyone, I'm Martin Becze. I'm going to talk about methods for deterministically parallelizing message processing, which sounds like a really fun title. So I'm really going to talk about Primea. Primea is a prototype of me experimenting with this idea of deterministically parallelizing message processing. An overview of the talk: I'm going to talk about the history of the eWASM kernel and Primea, problems with current blockchain computers, and hopefully some potential solutions.

So the roots of Primea really started in the eWASM days, and it builds extensively off the ideas used in eWASM. If you're familiar with eWASM, the idea was to adopt the WebAssembly instruction set for usage in the Ethereum blockchain. Another important part of this was to add metering to WebAssembly. To run this modified WebAssembly binary, we created something called the eWASM kernel, which provided the methods to link the Ethereum blockchain to the WebAssembly binary. And shortly after that, I used the eWASM kernel to prototype something called nested contracts, which I'll talk about in a bit.

After that, it was sort of a branching point. WebAssembly solved several problems: it was more performant than the EVM, it was more extensible, and it had a larger ecosystem; you could compile C, C++, or any LLVM-compatible language to WebAssembly. But thinking about starting from the virtual machine layer and going up, the next performance bottleneck was parallelization. Currently, the way things are structured in Ethereum, we can't run transactions or messages concurrently. So Primea was focused on prototyping ideas around concurrency, and it moved towards using the actor model, which I'll talk about more in a bit.

So, a brief overview of what Primea is. It provides inter-process communication for contracts. It has a microkernel-like design, though I use that term fairly loosely. It's concurrent. It's deterministic. And there's a JavaScript prototype on GitHub.
And currently, it's being developed in conjunction with DFINITY, and hopefully it will be useful in Ethereum once we have sharding. You can think of Primea as the layer that sits above the virtual machine instruction set but below consensus, and it's totally decoupled from both. It can support heterogeneous instruction sets, too; in other words, it's totally agnostic to the instruction set. You can even mix and match instruction sets if you like. You could run WebAssembly, JavaScript, or pure x86, and ideally Primea could provide inter-process communication across those heterogeneous instruction sets.

So, blockchain computers. According to a Google Images search, this is what a blockchain computer looks like: a star going supernova with quantum entanglement or something. But you could also make an argument that a blockchain computer currently looks like this: a fairly early computer, or maybe an old cell phone. A few problems that we have right now: it's not very performant compared to a server; we haven't really accomplished scalability yet; the design is still fairly ad hoc and doesn't borrow much from what was learned in previous designs, of operating systems, for example; and it's hard to extend. Because of these, it's also hard to interoperate with existing systems. If I have a program running on a server, it's fairly hard to move that program to running on a blockchain computer; you have to rewrite it.

So, scalability. Scalability is a really hard subject, and I don't have any solutions. But what I want to try to accomplish is to observe some properties that scalability might have, and how to express those properties in inter-process communication. I think a fundamental property we'll see in scalability is a need to impose locality. So what do I mean by locality?
In the physical world, if we have two objects separated by some distance that are communicating, locality is sensed by the time communication takes to happen between those objects. And the time it takes is how far apart they are divided by the speed at which the communication happens. The speed comes from kinetic energy, essentially the mass of the object and the force that was used to push it. In the world of incentivized computation, we can use this as an analogy: the mass of an object is its size, in bytes or bits, and how much force we push it with, you could think of in terms of, maybe, gas price. So how could we impose locality on contracts? Essentially, what that means here is that contracts further away will sense distance via longer communication times, or higher costs, or both.

Currently, Ethereum has a flat namespace for contracts; all contracts exist along a single tier, if you will. My first idea for imposing locality was the idea of subcontracts. The idea would be that each contract could create subcontracts that only it had access to. Now, this idea has a few problems, but let's explore it a little bit more. A parent contract can delete its subcontracts, move its subcontracts, and send and receive messages from its subcontracts. And if we take that tree of contracts and subcontracts and look at it from the top, you can flatten that tree into a set of nested circles, like here, where the top circle would be the global blockchain, the circles inside that would be contracts, and the circles inside those would be subcontracts, and so on. This is also analogous to membrane computing, which is inspired by biology, modeling things like cells. Any communication happening within one of the membranes and going outside of it must pass through the membrane; the membrane here is the circle. So having this membrane-type system, we can impose locality.
So if contract A wants to talk to contract B, for example, the message would have to go through the membrane of A's parent, A prime, then through B prime, and then finally to B. Locality here is sensed by the extra message passing: contracts that are further away, in terms of parent contracts, have more cost associated with them, because there are more messages. And then you could use the same structure for shards. A and B could be sharded easily, and you could maybe have a recursive nested sharding scheme using this: you have subcontracts, the subcontracts have subcontracts, and some of those would be shards. In reality, I don't know if you need more than maybe two layers of shards, though, so it's dubious whether that's necessary.

There are problems with nested contracts, though: they're a bit inflexible. If you have a bunch of contracts that communicate quite frequently but are distributed across a bunch of different parent contracts, then it's hard to rearrange things to provide optimal communication. They're also inefficient because of that; you have the extra message passing. So that was the first iteration, in the eWASM kernel, experimenting with this idea.

The next step was, OK, let's loosen the restriction of message passing up and down a tree of subcontracts, so that contracts are just connected by a graph. But we still want the property of imposing locality, and a nice way to build shards around our contracts. What you can do here is the tree just becomes the shard boundaries, but we still have a more efficient way of communicating, since contracts can be arranged in an arbitrary graph. This structure, interestingly enough, I think is represented very well by bigraphs. Bigraphs are an algebra created by Robin Milner to model communicating agents in space and time, so I find it a nice fit. Here, the shard boundaries would be the place graph, which is a forest.
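The "extra message passing" cost above can be sketched as the number of membrane crossings between two contracts in the tree. This is an illustrative toy, not Primea's actual API; the names `depth`, `hopCost`, and the `parent`-pointer object shape are invented here:

```javascript
// Toy sketch: under nested contracts, message cost is proportional to
// the number of membranes crossed, i.e. the tree distance between
// sender and receiver.
function depth(node) {
  let d = 0;
  for (let n = node; n.parent; n = n.parent) d++;
  return d;
}

// Count membrane crossings: walk both contracts up to their lowest
// common ancestor, counting every edge traversed.
function hopCost(a, b) {
  let da = depth(a), db = depth(b), hops = 0;
  while (da > db) { a = a.parent; da--; hops++; }
  while (db > da) { b = b.parent; db--; hops++; }
  while (a !== b) { a = a.parent; b = b.parent; hops += 2; }
  return hops;
}

// Example tree from the talk: root contains A' and B',
// which contain A and B respectively.
const root = { parent: null };
const aPrime = { parent: root }, bPrime = { parent: root };
const a = { parent: aPrime }, b = { parent: bPrime };

hopCost(a, b);  // A → A' → root → B' → B crosses 4 membranes
```

Siblings under the same parent get a low cost, while contracts far apart in the tree pay for every membrane in between, which is exactly how locality is sensed here.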
And then the lines of communication would be the link graph, which is a hypergraph. When we put those two graphs together, we have a structure called a bigraph, and that gives us a nice algebra to work with to express these things.

So communication here is a bit different from the way communication between contracts works now in Ethereum. As it stands now, any contract can send a message to any other contract. You might notice that here, communication is more restricted: you have to have a line of communication. But this does have some benefits. It allows us to build isolation, groups of contracts that we know cannot be contacted by other groups of contracts, and therefore it allows us to build modularity, which I'll talk about a little more later. Also, this makes it fairly easy to build concurrency.

So the first idea for expressing these lines of communication, or graphs of contracts, was channels. If you're familiar with CSP, this should look familiar to you. A channel is just a bidirectional line of communication that is always one-to-one. Inside the contracts, channels are addressed by ports, and ports have names internal to the contract. So this blue contract here might call the yellow contract "dog," another one "cat," and the orange one something else, but the naming is only relative to that contract; the yellow contract might name things differently. These ports are also mobile, so the graph can rearrange. For example, if B wants to talk to C, but right now only A can talk to B and C, then A can send B the port to C in a message, and now B can talk directly to C. Therefore, we can now define messages as two things: data and port references. And you might observe that ports here are also capabilities, in the sense of capability systems: a port is a capability to send a message, and it's an opaque reference. It's unforgeable; you can't create ports yourself, since they're maintained by the system.
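The port-mobility example above (A handing B a port to C) can be sketched with a toy runtime. Everything here, the `Contract` class, `send`, `receive`, is invented for illustration and is not Primea's real interface; the point is just that a message carries data plus port references, and receiving a port grants the capability to use it:

```javascript
// Toy sketch of channels, ports, and port mobility.
class Contract {
  constructor(name) {
    this.name = name;
    this.ports = new Map();   // local port names → port objects
    this.inbox = [];
  }
  // Send data (and optionally some of our ports) down a named port.
  send(portName, data, ports = {}) {
    const port = this.ports.get(portName);
    port.target.inbox.push({ data, ports });
  }
  // Receive the next message, adopting any ports it carries.
  receive() {
    const msg = this.inbox.shift();
    for (const [name, port] of Object.entries(msg.ports)) {
      this.ports.set(name, port);  // the capability is now ours
    }
    return msg.data;
  }
}

const a = new Contract('A'), b = new Contract('B'), c = new Contract('C');
// Initially only A holds ports to B and C.
a.ports.set('toB', { target: b });
a.ports.set('toC', { target: c });

// A sends B the port to C; after receiving, B can message C directly.
a.send('toB', 'here is C', { toC: a.ports.get('toC') });
b.receive();
b.send('toC', 'hello from B');
```

Before the transfer, any attempt by B to reach C would fail, since B simply holds no port to it; that is the unforgeable-reference property doing the work.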
And if you're interested in capabilities, you should check out E-Rights. It was a pre-blockchain language used to express, essentially, smart contracts. Ports are also transferable, which I talked about with mobility.

So now it should be apparent how we can build isolation. For example, we might have a set of three contracts: one does authentication logic, an ACL; another does some sort of business logic; and a third might provide a database-like interface. No external communication can happen to the contracts in yellow; everything has to pass through authentication. I think this helps us a lot in reasoning about security, because now we can just plug and play contracts, like business-logic contracts and database contracts, without having to worry about the security implications, because all communication has to pass through the authentication, or ACL, contract. If you were to audit the system, you would really just have to focus on the authentication, and you could use off-the-shelf parts for everything in the isolated area.

Isolation also helps us build concurrency. Knowing who can talk to whom allows us to find mutually exclusive sets of contracts, and if we have mutually exclusive sets of contracts, then we can run them in parallel.

There are a few problems with ports and channels, though, from the point of view of CSP. There's system overhead associated with them, since every channel is bidirectional: whenever you move a port, the system also has to update the other end of the channel with the port's new location. Also, I think there's too much restriction, because I think it's still useful to have Ethereum-style messaging, where a contract can send to a synthesized address rather than an opaque address maintained by the system. So if we loosen the restrictions on channels, we just have addresses.
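The concurrency claim above, that knowing who can talk to whom lets us find mutually exclusive sets and run them in parallel, amounts to finding connected components of the communication graph. A minimal sketch with union-find (illustrative only, not Primea's scheduler; `components` and the edge-list format are invented here):

```javascript
// Toy sketch: mutually isolated groups of contracts are the connected
// components of the communication graph; each component can be run in
// parallel with the others.
function components(n, edges) {
  // Union-find with path compression over contract ids 0..n-1.
  const parent = Array.from({ length: n }, (_, i) => i);
  const find = x => (parent[x] === x ? x : (parent[x] = find(parent[x])));
  for (const [u, v] of edges) parent[find(u)] = find(v);

  // Group contracts by their root representative.
  const groups = new Map();
  for (let i = 0; i < n; i++) {
    const r = find(i);
    if (!groups.has(r)) groups.set(r, []);
    groups.get(r).push(i);
  }
  return [...groups.values()];
}

// Contracts 0, 1, 2 can reach each other; 3 and 4 form a separate island.
components(5, [[0, 1], [1, 2], [3, 4]]);
// → [[0, 1, 2], [3, 4]] — two groups, schedulable in parallel
```

Because capabilities are unforgeable, the edge list really is the complete set of possible interactions, which is what makes this partition safe.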
But we can still maintain all the good properties of channels, like isolation, if we have opaque addresses. An opaque address is just like a port: something maintained by the system. You can't synthesize it; you just get a reference to it in your contract. But where ports and channels are bidirectional, addresses are unidirectional. If I have an address to a contract, I can send a message to it, and I can also send the address to another contract, and then that other contract can send a message to the contract. Addresses maintain the same properties as capabilities. So they have all the good properties and less system overhead, especially with regard to mobility, because now, when we move addresses around in messages, we don't have to update the contract they point to at the system level. Also, we can layer synthesized addresses on top of this to represent Ethereum-style contracts, where a contract could take in a string and then use that string to send a message to an address.

So now we can really talk about synchronous calling, now that we have all the prerequisites. Synchronous calling currently works like this: A calls B, and after it calls B, it has to pause; it can't do anything. Let's say A has 100 bytes allocated in memory. Well, we have to just keep that there. And now B is running, and it allocates another 100 bytes, so we have to keep 200 bytes in memory, and A still can't do anything. Now B calls C, and B is paused. So as you can see, we get this linear growth in resource consumption, where only the active contract can use resources and all the other contracts are frozen, so their resources go to waste. The second thing is that atomicity is really expensive. Not only are resources, in terms of memory and computation, frozen and going to waste; reverting also adds a lot of overhead for atomicity.
For example, when A calls B, we need to keep both A's modified state and B's modified state. Then when B calls C, we have to keep A's, B's, and C's modified state. When C's done running, we still have to keep around C's modified state, and B's, and A's. And when B's done running, we have to do the same for A.

If you loosen this restriction and just have contracts running concurrently, then from a system point of view, A would run, have its 100 bytes allocated in memory, and then that memory would be freed. Then B would run, and C, and so on. But one problem with this is that you still might want to represent atomicity, and you can do this using something called a two-phase lock, and we can provide some system primitives to make this easier. Essentially, how this would work is, say B is performing a two-phase lock: A calls B; B returns to A and locks, basically giving A a lock; then A does something and returns to B, telling it to unlock. The important thing here is that the commit or the revert can be enforced by the system, so we don't have to implement a full two-phase protocol.

Oh, I think I'm out of time. But OK, that was the gist of it. There's more stuff we can talk about later. All right, thank you, everyone.
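The two-phase pattern just described can be sketched as follows. The primitives `prepare`, `commit`, and `revert` are hypothetical names invented for this sketch (in the talk's design, the commit or revert would be enforced by the system rather than written by the contract author):

```javascript
// Toy sketch of the two-phase lock: B does its work tentatively and
// hands A a lock; A finishes its own work, then tells B to unlock,
// either committing or reverting B's pending changes.
function twoPhaseCall(a, b) {
  // Phase 1: B runs and returns tentative state changes plus a lock,
  // instead of committing immediately.
  const pending = b.prepare();
  let committed = false;
  // Phase 2: A finishes; on success B commits, on failure B reverts.
  try {
    a.finish();
    b.commit(pending);
    committed = true;
  } catch (e) {
    b.revert(pending);
  }
  return committed;
}

// Minimal example contracts for the sketch.
const b = {
  state: 0,
  prepare() { return { newState: this.state + 1 }; },
  commit(p) { this.state = p.newState; },
  revert(p) { /* drop the pending changes; state is untouched */ },
};
const a = { finish() { /* A's own work succeeds here */ } };

twoPhaseCall(a, b);  // b.state is committed to 1
```

Unlike the synchronous-call picture above, nothing outside the pending record stays frozen while A runs, and atomicity is paid for only by the contracts that opt into the lock.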