Hello. Welcome to our presentation on trustless validator pools in Eth 2.0. I'm Dankrad Feist, and this is my colleague, Carl Beekhuizen. We are both researchers on the Eth 2.0 research team. We'd like to talk a bit about our efforts to make trustless staking possible in Eth 2.0, and how we designed the protocol to make this possible. Then at the end, Carl will talk about the actual implementation of this, how it would work. So this is the outline of the talk. First, motivation: why do we need trustless staking pools, and why do we want this as a primary goal in Eth 2.0? Then, how do you design such a protocol? So the technology, basically, that is needed to have trustless staking pools: secure multi-party computation. And then there are less and more advanced ways to have these trustless staking pools. You can have a basic algorithm that enables a sort of trustless staking, or you can, if you want, extend this with fault attribution, so that even if your honest two-thirds majority assumption fails, you can still, with very high likelihood, avoid losing your money if you're in such a pool. OK, so why do we need trustless staking pools? The first thing you have to know about Proof of Stake in Eth 2.0 is that we need to fix the amount that is staked by every validator to roughly the same amount. And why is that? It's very difficult to design a protocol that works in a fair way with vastly different stakes. For example, when you sample people for committees, you don't make any adjustments for their stake, because it would just be too complex. So the protocol is designed so that it only really works if what everyone has staked is roughly similar. It can vary slightly, but if it varies too much, then the assumptions will fail. The amount that we agreed on is 32 ETH, which at today's price is about $5,500 US.
But say one ETH goes to $10,000 US, which is very well possible from what we know. Then one validator would have to stake $320,000 US, which is great for security, because our security assumption in Proof of Stake of course depends on how much money is staked by the validators overall. But it would raise the barrier to entry so high that very few people could actually afford to become a validator, from the capital cost point of view. And that's not what we're aiming for: we don't want validating to just be another easy income stream for the rich that's not accessible to everyone else. And since, as I mentioned before, it's not really an option to allow staking with an arbitrary amount, the alternative is that we create staking pools, where several people can come together and say, we want to run one validator together, and everyone puts in part of the required capital. The nice thing about trustless staking pools, as we're going to talk about in this presentation, is that they can still be decentralized. Having a pool doesn't mean, oh, I know Jeff is a trustworthy guy, everyone will just give him their money and he'll run it. No, we can actually do this so that everyone runs that validator together in a multi-party computation, and you don't need to fully trust that any single participant is completely trustworthy. And there's actually also a second very good reason to do this, which is that even if you are a single person putting up the deposit, you might not actually want to run your validator on just one machine. Because the thing is, to run the validator, you need to have the validator secret on that machine. And that's, of course, a huge potential security risk, because someone hacking that machine could just do whatever they want with that key and potentially get you slashed.
So a nice thing is, if you have the technology to enable multi-party computation for validators, that also means you can increase your security by distributing your key across, say, three machines. Say you want to run it in the cloud and you don't fully trust your cloud provider: you can have one on Azure, one on AWS, and one on Google Cloud, so you avoid having a single point of failure. Cool. So let's come to the technology that makes this possible. In order to sign anything on Eth2, we chose the signature scheme called BLS signatures. Basically, the way it works is that it uses elliptic curve groups with a pairing. What that means is that you have a pairing function e that pairs two elements from the elliptic curve groups, with a linearity relation: you can move a factor n from the first argument to the second argument, and you can also move it out of the pairing, so e(n·P, Q) = e(P, n·Q) = e(P, Q)^n. A secret key is just an integer, and the public key is your generator G1 multiplied by your secret key. To sign a message, you map the message to a point H(m) in the elliptic curve group and multiply it by your secret key. And to check the signature, you use the pairing equation: you check that the pairing of your public key and the message point is the same as the pairing of the generator and the signature, e(pk, H(m)) = e(G1, sig). The amazing thing about this signature scheme is that, if you look at the verification equation, it's linear in the public key and the signature. This means you can add two public keys and two signatures, and the result will still be a valid signature on that message for the sum of the two public keys. So basically, you can just add signatures together to create a new, aggregated signature. And this is amazing, because it enables many of the things that we do in Eth2 in the first place.
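To make the linearity concrete, here is an intentionally insecure toy model of the BLS equations, where group elements are plain integers mod q, "scalar multiplication" is ordinary multiplication, and the pairing is just a product. Real BLS uses a pairing-friendly elliptic curve such as BLS12-381; the prime, generator, and function names here are purely illustrative assumptions.

```python
# Insecure toy model of the BLS equations over integers mod q.
# Only illustrates the algebra, never use for real cryptography.
import hashlib

q = 2**127 - 1          # toy group order (illustrative, not the real curve order)
G = 7                   # toy generator

def hash_to_point(msg: bytes) -> int:
    # Stand-in for hashing a message to a curve point.
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % q

def keygen(sk: int) -> int:
    return sk * G % q                     # pk = sk * G

def sign(sk: int, msg: bytes) -> int:
    return sk * hash_to_point(msg) % q    # sig = sk * H(m)

def pairing(x: int, y: int) -> int:
    return x * y % q                      # trivially bilinear toy pairing

def verify(pk: int, msg: bytes, sig: int) -> bool:
    # Check e(pk, H(m)) == e(G, sig).
    return pairing(pk, hash_to_point(msg)) == pairing(G, sig)

sk1, sk2 = 1234, 5678
pk1, pk2 = keygen(sk1), keygen(sk2)
msg = b"attestation"
s1, s2 = sign(sk1, msg), sign(sk2, msg)

assert verify(pk1, msg, s1) and verify(pk2, msg, s2)
# Linearity: the sum of signatures verifies under the sum of public keys.
assert verify((pk1 + pk2) % q, msg, (s1 + s2) % q)
```

The last assertion is exactly the aggregation property from the talk: adding signatures gives a valid signature for the sum of the public keys.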
In a way, it enables sharding in the first place, because it means that thousands of signatures can just be aggregated into one single signature that can be checked once using this pairing, and you know that everyone has signed correctly. So this saves a huge amount of data and computation. But at the same time, since it is linear, it also enables something called Shamir secret sharing. The idea behind Shamir secret sharing is, let's say we have 10 parties who want to share a secret. What we do is we make the secret a number and encode it by creating a polynomial that evaluates to that number at 0. And we give each of those 10 parties one point on that polynomial. Now, the degree of that polynomial determines how many of the parties you need in order to reconstruct the secret. For example, here I've chosen a polynomial of degree 3, and we know that any polynomial of degree 3 can be reconstructed from four values. So this automatically gives us a 4-out-of-10 scheme: you need four of the people, and any four, no matter which four, can recreate that point at 0. And due to the linearity of the BLS signature scheme, that means we automatically have a threshold signature scheme, so we can design an m-out-of-n scheme. That gives us threshold BLS signatures, which is the most important thing, because most of the work that you do as a validator is signing things, like attesting to a block, saying: this was a correct block, this is the state of that shard at that block. And now we can do that in a decentralized way. Right. Now, coming to one point that was kind of difficult in this: we have one element of the protocol that inherently needs more computation involving your secret, which cannot be solved using these aggregate signatures. And that's the so-called proof of custody. What's the proof of custody?
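The Shamir construction just described can be sketched in a few lines: the secret is f(0) for a random polynomial f of degree threshold-1 over a prime field, each party gets one evaluation, and any threshold-sized subset reconstructs f(0) by Lagrange interpolation. The field prime here is an illustrative choice, not the BLS12-381 curve order.

```python
# Sketch of Shamir secret sharing over a prime field.
import random

P = 2**255 - 19  # illustrative prime field modulus (assumption)

def make_shares(secret: int, threshold: int, n: int):
    # f(x) = secret + c1*x + ... + c_{t-1}*x^{t-1}, so f(0) = secret.
    coeffs = [secret] + [random.randrange(P) for _ in range(threshold - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation evaluated at x = 0.
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

# Degree-3 polynomial, 10 parties: the 4-of-10 example from the talk.
shares = make_shares(secret=42, threshold=4, n=10)
assert reconstruct(shares[:4]) == 42       # first four shares work
assert reconstruct(shares[-4:]) == 42      # so do any other four
```

Because the BLS verification equation is linear, the same interpolation applied to partial signatures (rather than raw shares) yields the threshold signature scheme mentioned above.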
The idea behind the proof of custody is that you want to prove possession of data. What we want is that when you sign that you have seen a certain amount of shard data, you also prove that you have actually seen that data. Because otherwise, you have this so-called lazy validator problem: a validator says, oh, I've seen some signatures for this data, so probably it's fine, I'll just sign it without doing the work. And that's probably OK 99% of the time. But in the 1% of the time where you do have an attacker who does something really evil, they could use those lazy validators to massively amplify their attack. Because they only need a small number of signatures on some data, for example data that's not available at all, and then all those lazy validators would sign it, and suddenly you have this unavailable data signed off, which is a massive problem for the chain. The way we avoid this is by having every validator, whenever they sign shard data, generate one extra bit, the so-called custody bit. And that's basically a mix of a secret, the secret here, and the data, at every data block. This is the original construction, which is good for understanding how it works: you compute a hash tree root of this whole thing, and then you take the first bit of the root. Basically, only if you have that secret can you compute it. If you don't have it, you can't compute it, and someone else can't easily compute it for you unless they have that secret. And if you give away that secret, then we can slash you. So that gives you a very strong incentive to actually get the data, because if you don't have it, it's very likely that your custody bit is going to be wrong. Now, the problem with this is: how do you do that if you are in a validator pool?
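A minimal sketch of that original hash-based custody bit might look as follows. The chunking, the way the secret is mixed in, and the use of a flat hash instead of a proper Merkle hash-tree-root are all simplifying assumptions for illustration, not the exact Eth 2.0 construction.

```python
# Sketch of a hash-based custody bit: mix the validator's period secret
# into every data chunk, hash everything, take the first bit of the root.
import hashlib

def custody_bit(secret: bytes, data: bytes, chunk_size: int = 32) -> int:
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    h = hashlib.sha256()
    for chunk in chunks:
        # Mixing the secret into each chunk means the bit cannot be
        # computed without both the secret and the full data.
        h.update(hashlib.sha256(secret + chunk).digest())
    # First (most significant) bit of the root, standing in for the
    # first bit of the hash tree root in the real scheme.
    return h.digest()[0] >> 7

bit = custody_bit(b"my-period-secret", b"some shard data" * 100)
assert bit in (0, 1)
# Deterministic given the secret and data:
assert bit == custody_bit(b"my-period-secret", b"some shard data" * 100)
```

Without the secret the bit is unpredictable to others; without the data, a validator holding only the secret can do no better than guess, which is exactly the incentive the talk describes.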
If you don't actually want anyone to hold the full secret that could get everyone slashed, it would be a massive problem if someone needed to have this whole secret. The way we solved that problem is we found a new pseudo-random function that is very friendly to compute in a multi-party computation, so that you can compute it with many participants in a very efficient way. It is defined using the so-called Legendre symbol, written (a | p): it's 1 if a is a quadratic residue modulo p, so if there's a number that squares to a modulo p, and minus 1 if it's not. And then there's a special case that it's 0 if p divides a. But in a way, you could say that never happens, because the prime we're using is so crazy big that this is like a zero hash or something; it just doesn't happen. We normalize this to a bit, because 1 and minus 1 are not really nice to work with in a protocol. And then the PRF is defined by just computing this Legendre bit of the sum of the secret and the data. The nice thing is that this is super easy to compute in a multi-party computation. I won't go over that in detail right now because the time for that is a bit short, but basically, there's a nice way to just blind the whole thing. And once it's blinded, you can do the actual computation in the open, and then it's very easy to reconstruct the original result from that. So basically, we can replace the proof of custody using this pseudo-random function, and that gives us an MPC-friendly proof of custody protocol. I've been working on this Legendre function for a while now because we really want to use it. The only problem with it at the moment is that it hasn't had a great amount of cryptanalysis, and so we're currently working on improving that. So one thing I wanted to mention here is that we set bounties on improving the state of the art.
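The Legendre PRF bit just described is easy to sketch: compute the Legendre symbol of (key + input) mod p via Euler's criterion and normalize it to {0, 1}. The prime here is chosen for illustration; the real construction uses a large prime fixed by the protocol.

```python
# Sketch of the Legendre PRF bit: the Legendre symbol of (key + x) mod p,
# normalized to a bit. Parameters are illustrative assumptions.
p = 2**127 - 1  # a Mersenne prime, standing in for the protocol's prime

def legendre_symbol(a: int) -> int:
    # Euler's criterion: a^((p-1)/2) mod p is 1 for quadratic residues
    # and p-1 (i.e. -1) for non-residues.
    return pow(a, (p - 1) // 2, p)

def legendre_prf_bit(key: int, x: int) -> int:
    s = legendre_symbol((key + x) % p)
    # Normalize: 1 -> 1, -1 -> 0. (The zero case "never happens" for a
    # cryptographically large prime, as noted above.)
    return 1 if s == 1 else 0

bits = [legendre_prf_bit(123456789, x) for x in range(8)]
assert all(b in (0, 1) for b in bits)

# The symbol is multiplicative, which is what makes the MPC blinding
# trick work: blinding a value by a square r^2 does not change its symbol.
r, a = 987654321, 5
assert legendre_symbol(pow(r, 2, p) * a % p) == legendre_symbol(a)
```

The last assertion is the heart of the blinding idea mentioned in the talk: the pool can open a blinded value in public, compute its symbol, and the result still determines the symbol of the hidden value.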
So we have both asymptotic bounties, for finding any better algorithms, and some concrete ones. There will soon be a smart contract for resolving these, so stay tuned. But you can already get those challenges at legendreprf.org/bounties. If you find a solution, just email me, and you can already claim them. Basically, you can win between 1 and 16 ETH for finding keys for the Legendre PRF in different instances. The smallest ones are designed so that with a few months or so of compute time, you can actually solve them. We're expecting them to be solved, but we'd be really interested in how long it actually takes. The most difficult ones, hopefully no one can ever solve. But we want to know if there are any algorithmic improvements that might change this. Cool. With this, I'll hand over to Carl. Cool, so moving on to how we actually apply this. This is going to be a bit forced due to time constraints, but here we go. There's a distinction to make here quickly between two ways of constructing pools. One is where you try to use economic incentives and custodial relationships to manage a pool, which is more like the case of Rocket Pool. Whereas this is more designed to be run as an MPC, where you want to be involved in the pooling structure: you don't want to hand your ether over to a pool, even if they are incentivized. So these are what you are required to do as a validator within Eth 2.0, the primary responsibilities with their frequencies. All of this is relatively cheap in terms of what needs to be done. And as you can see, things like the MPC-calculated Legendre PRF shows up once an epoch. These kinds of things are enabled by the work Dankrad presented. Let's skip over that. So the obvious way to do this would be something like PBFT consensus for a pool, because we need a system that is safe but not necessarily live.
Because if you ever commit to something that a supermajority of your nodes doesn't agree with, then you've run into the scenario where you can get your pool slashed. A relatively easy way of achieving this is actually just using the BLS signatures. You set your threshold at two-thirds of your pool size. Based on this, if you have to propose, one of the pool members proposes. Otherwise, you see what your attestation duties are, which is available if you have a view of the chain, you compute the appropriate custody bits, and you sign an attestation only if you think that attestation is valid in your local view. And with basically only the overhead of BLS signatures plus the custody bits that Dankrad presented earlier, this allows you to have pooling, which is very cool. Unfortunately, that doesn't guarantee consistency, because it's not a full consensus protocol: people can disagree as to what the state of the chain is at any given time. So basically, you can have some structure that runs PBFT if you run into an unhappy case where the pool gets out of sync. You can also make this slightly more interesting with some kind of metapool, which exists between the pools. As a pool member, you don't only participate in your pool, but also in this larger metapool. And this allows you to use the metapool for fault attribution, where you prove to this metapool that someone did something bad in your pool. Then, if one of your pool members got your pool slashed, because the slashing doesn't burn all of the ether, only a portion, you can basically take the whole penalty and put it on that one person who misbehaved, and give the other people their money back. In fact, it may turn out that you can get more money out, so you may make a profit if one of your pool members gets slashed, which is pretty cool. It depends on exactly how you construct it.
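Putting the pieces together, the pool's signing flow can be sketched end to end: the validator key is Shamir-shared, each member produces a partial signature with its share, and any two-thirds subset combines them with Lagrange coefficients into the full signature. As before, group elements are modeled as plain integers mod q, an insecure stand-in for the BLS curve group, and all names and parameters are illustrative.

```python
# Toy sketch of 2/3-threshold signing for a validator pool.
# Insecure stand-in algebra; only the threshold structure is the point.
import hashlib, math, random

q = 2**127 - 1  # toy group order (assumption)

def hash_to_point(msg: bytes) -> int:
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % q

def share_key(sk: int, threshold: int, n: int):
    # Shamir-share the signing key: f(0) = sk, degree threshold-1.
    coeffs = [sk] + [random.randrange(q) for _ in range(threshold - 1)]
    f = lambda x: sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q
    return [(x, f(x)) for x in range(1, n + 1)]

def partial_sign(share, msg: bytes):
    # Each member signs with its key share; no one ever holds sk.
    x, y = share
    return (x, y * hash_to_point(msg) % q)

def combine(partials):
    # Lagrange coefficients at 0 lift partial signatures to the full one.
    sig = 0
    for xi, si in partials:
        lam = 1
        for xj, _ in partials:
            if xj != xi:
                lam = lam * (-xj) % q * pow(xi - xj, -1, q) % q
        sig = (sig + lam * si) % q
    return sig

sk = 31337
n = 10
threshold = math.ceil(2 * n / 3)          # 7-of-10 pool
shares = share_key(sk, threshold, n)
msg = b"attestation for slot 12345"

# Any 7 members suffice; the combined result equals a direct signature.
partials = [partial_sign(s, msg) for s in random.sample(shares, threshold)]
assert combine(partials) == sk * hash_to_point(msg) % q
```

Note that the members never reconstruct sk itself, only the signature, which is what lets the pool operate without anyone holding the full validator secret.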
So yeah, that's the basic construction and how something like this would work within Eth 2.0.