Hi, my name is Michael Sproul. I work at Sigma Prime in Australia, all that distance away, on the Lighthouse Eth2 client. And today I'm going to be talking about optimizing Ethereum 2.

With Ethereum 2, we have the benefit of having a nice executable specification. It's written in Python, with a focus on being very clear and very readable. With Lighthouse, our implementation is in Rust, and we have a focus on being fast and secure. We also want to be readable, but most of all we want to be secure, and performance helps with security too, because it helps us avoid denial-of-service attacks. So where the spec might prioritize readability and use some quadratic-time algorithms, in Lighthouse, and in most Eth2 clients, you really want to make sure that you're running quickly, using linear-time algorithms or something pretty close to that, particularly over the set of validators, which could be up to 4 million validators in a list.

First of all, one example of where the spec is slightly inefficient in how it phrases things is the shuffling of validators. The spec uses a one-at-a-time kind of shuffling, the swap-or-not shuffle. For each index it asks: which index does this one get shuffled to? And it does each index one at a time. It also extracts committees on demand. So if you get an attestation from the network and you need to know who the validators in the committee for that attestation are, it'll compute the shuffling, extract the committee on demand, then throw it away and recompute it when you get another attestation for the same committee, which is not so efficient.

So why not just shuffle all the validators once at the beginning of the epoch, cache that shuffling, and then read the committees off it on demand as you need them? That's exactly what we do. We use an algorithm that protolambda came up with, and I think Proto is probably here somewhere. It ends up being 250 times faster than shuffling the list one index at a time. So a 250x speedup is not bad, and then there's also the benefit of not redoing the computation each time you get an attestation, so it's 250 times or better.

In Lighthouse, what this looks like is that we have three caches of shuffled validators: one for the previous epoch, one for the current epoch, and one for the next epoch. When we hit an epoch transition, when we move from one epoch to the next, we update these caches by shifting them along by one and then computing from scratch the shuffling for the next epoch. This works perfectly so long as you know the seed for the next epoch, and by design in Eth2 we are meant to know the seed for the next epoch. So when we're transitioning from epoch B to epoch C, with C becoming the current epoch, we already know the RANDAO mix and the seed for the next epoch, and we're able to compute that shuffling. At least, that's the case with the spec today.
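To give a rough idea of what that cache rotation looks like, here's a minimal Rust sketch. The type and function names (ShufflingCaches, compute_shuffling, and so on) are illustrative stand-ins rather than Lighthouse's actual API, and the shuffle itself is left as a placeholder:

```rust
/// Hypothetical per-epoch shuffling caches (not Lighthouse's real types).
struct ShufflingCaches {
    previous: Vec<usize>, // shuffled validator indices for the previous epoch
    current: Vec<usize>,  // shuffled validator indices for the current epoch
    next: Vec<usize>,     // shuffled validator indices for the next epoch
}

impl ShufflingCaches {
    /// On an epoch transition, shift the caches along by one and compute
    /// the new "next" epoch's shuffling from its (already-known) seed.
    fn advance_epoch(&mut self, next_epoch_seed: [u8; 32], active_validators: &[usize]) {
        self.previous = std::mem::take(&mut self.current);
        self.current = std::mem::take(&mut self.next);
        // Shuffle the whole validator list once and cache it, instead of
        // recomputing the shuffling for every attestation we receive.
        self.next = compute_shuffling(next_epoch_seed, active_validators);
    }

    /// Read a committee straight out of the cached shuffling. Real committee
    /// assignment is more involved; contiguous equal-size slices are a
    /// simplification for illustration.
    fn current_epoch_committee(&self, index: usize, committee_count: usize) -> &[usize] {
        let committee_size = self.current.len() / committee_count;
        &self.current[index * committee_size..(index + 1) * committee_size]
    }
}

/// Stand-in for the swap-or-not shuffle applied to the full validator list.
fn compute_shuffling(_seed: [u8; 32], active_validators: &[usize]) -> Vec<usize> {
    // A real implementation would apply protolambda's list-wise
    // swap-or-not shuffle here, seeded by `_seed`.
    active_validators.to_vec()
}
```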
Now, when we were implementing v0.8 of the spec, we noticed that our next-epoch cache actually broke. And we were going, oh, this is weird, maybe the spec isn't meant to be like this. We went digging through some docs and we were going, oh, this is really strange: the RANDAO mix keeps updating right up until the red arrow here, just before the start of the next epoch, which means that the block proposer at the start of that epoch has less than one slot of notice that they are the block proposer. So, you know, they're kind of doing their epoch transition and then they're going, oh crap, I'm the block proposer, I'd better get on this and propose a block. We raised this with the spec authors and found out that, no, it's not meant to be like this: really, we should be looking at the RANDAO mix from two epochs back, where the green arrow is, and that was fixed in v0.8.3 of the spec.

So something surprising here is that implementing an optimization actually allowed us to discover a bug in the spec. I think this speaks to something more general in the Ethereum space, which is that having a diversity of implementations, in different languages and with different optimizations, phrasing the same thing in different ways, can actually lead to clarity in the specification. I think that's really important, and it's a good way to design software.

Let's talk a bit more about epoch processing, because there are a few more optimizations that we do around this, maybe less important than the shuffling one, which is such a massive speedup. With epoch processing, the spec occasionally iterates over lists of attestations and validators kind of redundantly. One example of this is when you're calculating the reward for a proposer based on the attestations that they've included in their blocks: it takes O(V × A) time, so it's kind of quadratic, where V is the number of validators and A is the number of pending attestations. And because the validator list is so large, we really don't want any sort of quadratic factor involving it. In the code from the spec you can see the nested for loops giving you the quadratic time: there's the loop over the validators, and then the loop over the attestations within that.

So rather than doing that quadratic-time thing, in Lighthouse what we do is just a linear pass over the validators and the attestations. We do it in three parts. We first go over the validators and get some basic info, like whether they're active in the current epoch, things like that. Then we iterate over the attestations to find out how people voted, whether they voted on the correct Casper FFG targets and sources. And then we do one final pass to sum up some balances, different types of total balances for validators. In total that's O(V + A) time. Just as an example, these are some of the totals we compute: the total balance of all the validators who were active in the current epoch, all the validators who attested in the current epoch, all the validators who attested to the correct target, et cetera, so on and so forth.
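As a rough Rust sketch of that three-pass structure (the struct and field names here are illustrative, not Lighthouse's real types), assuming we're handed the validators' activity flags and balances along with the pending attestations:

```rust
/// Hypothetical per-validator status built up during epoch processing.
struct ValidatorInfo {
    is_active: bool,
    effective_balance: u64,
    is_attester: bool,
    is_target_attester: bool,
}

/// Hypothetical summary of a pending attestation.
struct PendingAttestationInfo {
    attesting_indices: Vec<usize>,
    is_correct_target: bool,
}

#[derive(Default)]
struct TotalBalances {
    active: u64,
    attesters: u64,
    target_attesters: u64,
}

fn process_epoch(
    active_flags: &[bool],
    balances: &[u64],
    attestations: &[PendingAttestationInfo],
) -> TotalBalances {
    // Pass 1 (O(V)): basic per-validator info such as activity and balance.
    let mut statuses: Vec<ValidatorInfo> = active_flags
        .iter()
        .zip(balances)
        .map(|(&is_active, &effective_balance)| ValidatorInfo {
            is_active,
            effective_balance,
            is_attester: false,
            is_target_attester: false,
        })
        .collect();

    // Pass 2 (O(A)): mark how each validator voted, e.g. correct FFG target.
    for att in attestations {
        for &i in &att.attesting_indices {
            statuses[i].is_attester = true;
            if att.is_correct_target {
                statuses[i].is_target_attester = true;
            }
        }
    }

    // Pass 3 (O(V)): sum the total balances used for reward calculations.
    let mut totals = TotalBalances::default();
    for v in &statuses {
        if v.is_active {
            totals.active += v.effective_balance;
        }
        if v.is_attester {
            totals.attesters += v.effective_balance;
        }
        if v.is_target_attester {
            totals.target_attesters += v.effective_balance;
        }
    }
    totals
}
```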
As I said before, usually when you implement an optimization, you run the risk of breaking your client and running afoul of what the spec says you should be doing. So what we really need to do when we implement an optimization is guarantee that it has the correct behaviour and isn't going to break our client. In roughly increasing order of strength, we've got: looking at the spec code and the optimized version side by side and just kind of grokking that they're the same, which is the weakest guarantee, because people are pretty terrible at that; then unit tests; then the Ethereum Foundation's test vectors, which have been super useful; and then fuzzing.

So we've been doing quite a bit of fuzzing on Lighthouse: both crash fuzzing, to make sure functions never crash regardless of what inputs you give them, and differential fuzzing, comparing two different implementations, and Medi's doing more work along those lines in the next couple of months. Also, similar to that, randomized testing, property testing, similar to QuickCheck. And because I've got a bit of a formal verification background, I've got this itch that I haven't scratched yet for formal verification, and we'll see if I get around to scratching that.

Yeah, I think I've got time, so I'm going to do this section as well. As well as optimizing for performance, another thing we can optimize for when we're making an Eth2 client is the profit that the validator will bring in for being a validator. There's one interesting problem here that caught my eye a few months ago, and I'm maybe a bit obsessed with it if you talk to anyone around Sigma Prime, and that is the attestation inclusion problem. This problem kind of leads on from what we were talking about before with aggregating attestations: if you've got a whole bunch of attestations from the network, and you've got more than you can fit in a single block, deciding which ones to include in the block such that you maximize the profit that you get from the rewards is actually an NP-hard problem. It's an instance of the classical computer science problem called maximum coverage.

Just to show you exactly what that looks like: attestation inclusion says we have a whole bunch of attestations, and we need to find a subset of them, up to some maximum size, such that the sum of the rewards we get for all the validators we've covered with those attestations is maximal. The abstract version of this problem, weighted maximum coverage, says we have some collection of sets, and we need to find a subset of those sets, up to some maximum size, such that when we union all the sets in that subset together, the combined weight of all the elements is maximal.

The problem with NP-hard problems is that they're hard, and usually solving them exactly requires an exponential or semi-exponential time algorithm. So for now, we're using a greedy algorithm that works quite simply: start with an empty solution, look at the list of attestations you've got, and repeatedly choose the best attestation and add it to your solution. The best attestations are the ones that cover new validators, validators not yet covered by attestations included on chain or by attestations already in your solution, and among those, the ones that cover the most high-balance validators, because the reward you get paid is proportional to the balance of the validator whose attestation you include.

This greedy algorithm performs quite well: it's within a factor of 1 - 1/e of optimality, so it's always going to get you at least about 63% of the maximal reward that you could get, and in a lot of cases it will do better than that. The sort of pathological cases that hit this lower bound are kind of unusual. But nonetheless, I would like to look into doing some exact solving using integer linear programming at some point in the future; I think that could be fun. But it might require scheduling your block production well in advance of when you actually need to produce the block, which could be not so good.
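Here's a rough Rust sketch of that greedy selection, under some simplifying assumptions: the Attestation type and the reward function are illustrative stand-ins rather than Lighthouse's actual code, and the per-validator reward is just treated as proportional to that validator's balance:

```rust
use std::collections::HashSet;

/// Hypothetical attestation representation: just the validator indices it covers.
struct Attestation {
    validator_indices: Vec<usize>,
}

/// Greedy weighted maximum coverage over attestations, a sketch of the approach
/// described above. `reward` gives the payout for newly covering a validator.
fn greedy_attestation_selection<F>(
    attestations: &[Attestation],
    reward: F,
    max_attestations: usize,
) -> Vec<usize>
where
    F: Fn(usize) -> u64,
{
    let mut covered: HashSet<usize> = HashSet::new();
    let mut chosen: Vec<usize> = Vec::new();

    while chosen.len() < max_attestations {
        // Pick the attestation whose *not-yet-covered* validators carry the
        // largest total reward.
        let best = attestations
            .iter()
            .enumerate()
            .filter(|(i, _)| !chosen.contains(i))
            .map(|(i, att)| {
                let gain: u64 = att
                    .validator_indices
                    .iter()
                    .filter(|v| !covered.contains(*v))
                    .map(|&v| reward(v))
                    .sum();
                (i, gain)
            })
            .max_by_key(|&(_, gain)| gain);

        match best {
            Some((i, gain)) if gain > 0 => {
                covered.extend(attestations[i].validator_indices.iter().copied());
                chosen.push(i);
            }
            // Nothing left adds any new reward, so stop early.
            _ => break,
        }
    }
    chosen
}
```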
Yeah, so in conclusion, optimizing Eth2 is a lot of fun, and all the clients should definitely be doing it, and I'm sure they are. There's a link between performance and security, so optimizing to avoid denial of service is important. And if we're all optimizing and writing things in different ways, there's a chance we might find some more spec bugs, which is also lots of fun. So thank you very much.