Hey everyone, hola. What an awesome stage to give my first DevCon talk on. It's fantastic to be here. Thank you for taking the time to come and listen. So I've got a fair bit to cover. I've had to keep cutting slides all week because I realized how small 25 minutes is. So let's get into it. My name is Adrian Sutton. I'm a lead blockchain protocol engineer with ConsenSys. I started out four and a half years ago working on Hyperledger Besu before it was first released, back when it was Pantheon. More recently I've been focused on Teku, the consensus client. So I've had some experience with both the consensus side and the execution side, building new clients and getting them ready for production. Probably the highlight of my life now will be having seen the merge complete and go through. So very proud of that and very excited for Ethereum. What we're going to talk about today is some of the things that we didn't actually get done, and some of the consequences of the way we did the merge, which was deliberately fairly minimal so we could get it done and get rid of all that energy-burning proof of work. I'd love to tell you everything you ever need to know about Ethereum clients, and talk to you for a couple of hours about consensus and execution layers, but I've got 25 minutes. That's a bit of a problem. So we're going to cut the scope down a bit and skip over some of the finer details. We're focusing on the higher-level stuff here: the architecture, how things fit together. This isn't the kind of talk where I'm going to tell you the nitty-gritty details of how the engine API works. There's a little bit of assumed knowledge of Ethereum here, but I think that's pretty safe for this audience. You should all be able to follow along pretty well.
To give you some background, there are three key things that I want you to know about Ethereum clients, and they'll form a real grounding for the things we're going to talk about post-merge, what they mean and their impacts. The first thing is that there is a lot less difference between consensus clients and execution clients than most people believe. Yes, they do quite different jobs. Yes, at every opportunity when we were designing the beacon chain, we said, nah, mate, we can do that better, let's pick a different technology. So there's basically no shared code between the two. But there are a lot of shared concerns and concepts, and there is a lot that we can learn by talking to the other side, by consensus client developers talking to execution client developers and vice versa. Networking is probably a really good example of that, in that the core concept of it really has three parts. The first is discovery: we need to find peers. The second is gossip, so that new information flows around the network quickly. And the third is request/response, which is the ability to say, hey, I missed some data, can you give it to me? Or, I'm syncing, how do I catch up? Each of those uses quite different technologies. For discovery, consensus uses discovery v5, execution uses discovery v4. For the other two, it's libp2p versus devp2p. And in terms of the formatting on the wire, it's SSZ versus RLP as well. So completely different technologies underneath it, but still the same concern of: how do we find peers we can trust? Because the world is out to get us in client development. We don't trust anyone, and there's a sea of peers. How do we find the good ones? How do we find the honest ones, and how do we get a good view of the network and know what's going on? Things like Sybil resistance are common across clients, DoS prevention, and those peer-scoring algorithms.
Or the trade-offs between reducing latency when we're gossiping data by broadcasting it to more peers, versus the duplication of that data and the increase in bandwidth. And then, ultimately, they're both blockchain clients. They're both dealing with this tree of blocks and this concept of reorgs, which is fairly unfamiliar for people who are new to client development: this idea that the world is not fixed, it's not like a single database; suddenly we realize, hey, there's a better chain over here, and we reorg. How do we pull out, say, transactions or attestations and other operations from the chain we were on and check that they all made it onto this new chain? Lots of code and lots of logic flow around that, with lots of impact throughout the client in the way it's designed, the way it works, and the kinds of things we have to think about when designing algorithms. On the consensus side, probably the big thing that we keep coming back to, that drives a huge amount of our decisions and has impacts everywhere, is the fact that proof of stake is not just a block-driven blockchain, it's also time-driven. Even if no blocks are produced, the world state continues to change now that we're in proof of stake. Every 12 seconds is a slot, whether it's empty or not, and we apply rewards and penalties for validators. As a result of that, there's a fairly limited amount of time for blocks to be produced, gossiped around the network, imported into other clients, and then attestations produced. That's the first four seconds of a slot. It's pretty tight, and particularly post-merge we're seeing more and more that sometimes you don't quite make it: blocks are coming out late, they're taking too long to import, and then attestations are being produced before that block is actually ready. So we get a wrong head vote, which reduces the validator's rewards and slightly impacts the performance of the network.
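To put numbers on that timing, here's a minimal sketch of mainnet slot arithmetic, assuming the spec's mainnet constants (SECONDS_PER_SLOT = 12, with attestations due a third of the way into the slot). The function names are illustrative, not from any particular client:

```python
# Sketch of mainnet proof-of-stake timing. Constants are the mainnet
# spec values; attesters vote SECONDS_PER_SLOT // 3 = 4 seconds in.
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32

def slot_at(genesis_time: int, now: int) -> int:
    """Current slot number, counted from genesis (slot 0)."""
    return (now - genesis_time) // SECONDS_PER_SLOT

def attestation_deadline(genesis_time: int, slot: int) -> int:
    """Unix time by which a block for `slot` should be produced,
    gossiped and imported so attesters can vote for it as head."""
    return genesis_time + slot * SECONDS_PER_SLOT + SECONDS_PER_SLOT // 3

# 100 seconds after genesis we're in slot 8, and slot 8's block has to
# land within its first 4 seconds (i.e. by t = 8 * 12 + 4 = 100).
assert slot_at(0, 100) == 8
assert attestation_deadline(0, 8) == 100
```

A block that misses that four-second mark is exactly the "late block, wrong head vote" case described above.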
So if you ever propose something to a consensus client dev and it's going to slow down that process of getting a block out and brought into the client ready to be attested to, we're going to have a hard time with it and we're going to push back. On the execution side, it's a bit different. Time is still important, obviously, for performance, but the big thing is the world state. A beacon node's database is on the order of 50 to 100 gigabytes in total. For an execution client, it's more like 500 gigabytes to a terabyte. Very rough numbers, I told you it'd be high-level and hand-wavy, but that order-of-magnitude difference in size has all kinds of implications for performance. It means there are all kinds of different approaches to how you handle sync, how you prune that data and manage it as it changes, how you index into it, and that impact then flows through the rest of the execution client. Every time we're touching world state, everything that has operations to perform, whether it's transaction gossip and so on, all comes back to this world state and getting access to it. That's what drives a lot of the big decisions for execution clients, and we want to give them scope to keep inventing new ways of dealing with it. Okay, the post-merge world, now that we've set the scene. The deployment model, as you've probably got the hang of these days: we have two separate clients, you need to run both a consensus client and an execution client, and they both connect to their own P2P networks for gossiping data and retrieving stuff. Blocks come through the consensus client and pass through the engine API to the execution client, and it's that engine API that allows the consensus client to control the execution client. This is one of the first things people notice as something we probably should clean up post-merge: why do I have to run two pieces of software for a single Ethereum node?
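As an aside, that "consensus client controls the execution client" relationship boils down to a couple of authenticated JSON-RPC calls. A minimal sketch, using the engine API method names engine_newPayloadV1 and engine_forkchoiceUpdatedV1 from the spec, but with the payload fields trimmed down for illustration:

```python
import json

def new_payload_request(payload: dict, request_id: int = 1) -> str:
    """engine_newPayload: hand a block's execution payload to the
    execution client for validation and execution."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "engine_newPayloadV1",
        "params": [payload],
    })

def forkchoice_updated_request(head: str, safe: str, finalized: str,
                               request_id: int = 2) -> str:
    """engine_forkchoiceUpdated: tell the execution client which block
    is the head -- this is the consensus client 'driving'."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "engine_forkchoiceUpdatedV1",
        "params": [{
            "headBlockHash": head,
            "safeBlockHash": safe,
            "finalizedBlockHash": finalized,
        }, None],  # payloadAttributes stays null unless building a block
    })

req = json.loads(forkchoice_updated_request("0xaa", "0xbb", "0xcc"))
assert req["method"] == "engine_forkchoiceUpdatedV1"
```

In a real deployment these requests go over an authenticated HTTP endpoint (typically port 8551 with a JWT secret), which is the plumbing the "two pieces of software" question below is really about.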
Well, there are a couple of solutions we can look at for that. The first is embedding a light consensus client into each execution client. I really like this idea. It's simple for users to run a node, and it reduces the system requirements: a light consensus client needs a lot less bandwidth and CPU than the full consensus protocol. But it does come with some trade-offs. You will always be one slot behind head, because you've got to wait for the sync committee to do its job. You can't run a validator node this way. And there is a reduction in security guarantees. For most nodes, if you're the type of person who really just wants to run one thing because you're running a home node and you just want to send some transactions, it's probably a really great solution. But it's not for everyone. So often people suggest that we should combine clients: take a consensus client and mash it into the same process with an execution client. I'm a lot more skeptical of this. It does make it simple for users to start a node, in that they've got one process. But ultimately, it's really bad for client diversity. We've got something like five consensus clients and four execution clients to choose from, so that's 20 different combinations. Are we really going to maintain 20 different combined clients so you can run them as a single process? Who's going to do that? And even if we were prepared to do it, there are all kinds of dependency conflicts we're going to get into, because all of a sudden two separate development teams' work is being pushed into a single process. Resolving that means development teams working much closer together and being restricted by what the other is doing. So we get a lot more cognitive load for core devs, a lot more coupling, and we lose this nice separation that we've somewhat accidentally got between consensus and execution layers.
And that's a real disadvantage for the Ethereum ecosystem, because that separation, and the ability to have people who specialize in consensus and people who specialize in execution, allows us to build stuff a lot faster and much more in parallel. The final thing is that it encourages client scope creep. That might be a little unexpected, so let's dig into it a bit more. This is what a real Ethereum deployment often looks like, and it's not as simple as the first picture. Because we learned ages ago, even before the merge, that putting your wallet inside your execution client is just dumb. It's really bad for security, and so we stopped doing it. If you've managed to load your private wallet key into Geth and you're using those key management APIs, please stop. Use a real wallet. So you're already using two processes, two different things, to access Ethereum. Then, not every consensus client supports running the validator client and the beacon node at the same time in the same process. Some do, some don't. It's a very common requirement for people setting things up to have a separate validator client, for various reasons. And you might even take it a step further and want to secure your keys more, and so you use an external signer that possibly has an actual database for slashing protection. You probably should be monitoring your 32 ETH investment at minimum, so you get something like Prometheus. Everyone likes flashy graphs, so now you've got Grafana. And then you start adding on things like MEV-Boost. So all of a sudden, we've got a lot of stuff going on. You can't imagine all of this being put into one process. It's not viable, it's not what we want to do. And we don't want to lock people in by saying, hey, get started with Ethereum, here's a single process. Oh, you want to do something well? Ah, yeah, throw out everything you learned and do it a different way.
It doesn't make a lot of sense. So what I think is the real solution here is ethOS. Important note: please don't make it an actual OS, right? We've got good operating systems. But this idea of moving up to a coordination layer is going to be really important, so that we can maintain this decoupling of consensus and execution clients. They don't need to know too much about each other; they use the standard APIs to communicate. But we can start with: here's a single process you run, and it will spin up your execution and consensus clients and have them hooked together, ready for you. Then if you want to add a validator, it'll add in a validator client for you. It's just another config option, without having to start again. Same for MEV, same for metrics and so on. The really good news is that this already exists, and in fact it exists in multiple forms. Things like eth-docker, eth-wizard, DAppNode and Stereum all have some version of this that makes it easier to run a client and somewhat hides the fact that there are multiple processes and coordination required. Of course, the downside is exactly that: we're hiding the complexity. Under the hood there are lots of different things going on, lots of moving parts, and we need to be very careful to ensure that we have good definitions between each of those parts. That means things like specifying the engine API well, making sure all those little details get fleshed out and documented, and over time making it more robust. Okay. Beyond the initial deployment, what are some of the other problems and opportunities that we now have because of the merge and the way these two clients work? The first one is a pretty big one: we've got a bunch of data duplication. Consensus client blocks include the execution payload, which is what your execution client sees as a block. Kind of by default, both clients wind up storing that data.
So we've duplicated the execution payload, which is what contains all the transactions. It's pretty big, and that increases our disk space requirements overall. You will probably see, if you're running a beacon node post-merge, that its database grows in size faster than it did pre-merge because of this extra transaction data. There are a couple of ways we can address this. The first is a proposal that's been around since pre-merge, but execution clients didn't have time to deal with it, and I'm not sure consensus clients did either: get payload bodies. What this is, is a fairly efficient way for a consensus client to say, hey, I need the payloads for this range of blocks. That way, the consensus client can simply stop storing them, and instead rely on the fact that the execution client will be storing those blocks anyway. The consensus client does need that data, and it needs it fairly quickly when asked for it, because of things like queries to its REST API: if you ask for a block, it's got to give you the full block, so it's got to go and get it. Same for networking: when peers request a block, it's the full block that we need to send back, and that in particular we want to be fairly fast, and we really want it to be efficient. That's probably the big challenge we have with this deduplication right now: we're asking the execution client to store data and give us access to it on our terms. If you remember, the big thing for execution clients was managing data. They have a lot already, their databases are under a lot of pressure, and that's the performance bottleneck. Lighthouse has a feature where they kind of emulate this get payload bodies through the existing JSON-RPC. Cool tech, well done. And it turns out it puts a lot of pressure on the execution clients.
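The range request just described can be sketched as a single JSON-RPC call. This uses the method name engine_getPayloadBodiesByRangeV1 from the engine API spec, where the start block number and count are passed as hex quantities; the response (the lists of transactions the consensus client chose not to store) is omitted here:

```python
import json

def payload_bodies_by_range(start_block: int, count: int) -> str:
    """Build a request asking the execution client for the payload
    bodies of `count` blocks starting at `start_block`, so the
    consensus client can serve full blocks without storing payloads."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "engine_getPayloadBodiesByRangeV1",
        "params": [hex(start_block), hex(count)],  # hex quantities
    })

req = json.loads(payload_bodies_by_range(16_000_000, 32))
assert req["params"] == ["0xf42400", "0x20"]
```

The performance concern in the talk is exactly this call path: every REST API query or peer block request on the beacon node side turns into a read against the execution client's already-busy database.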
Sometimes it works well if you've got enough IO, sometimes it doesn't, and they've added an option to turn it off and just go back to duplicating the data, which for some people has worked out better. The other issue that floats around here is that we get increased coupling between the two clients again. Not so much on the logical side, in terms of what their responsibilities are, but more in that if your execution client is re-syncing, your consensus client just lost a bunch of block data it thought it had. There's a lot of complexity around how to deal with that. It's reasonable that you'll probably just start getting errors from the REST API saying, oops, don't have that block. Same for peers: it's quite possible you just don't have that block anymore. All these details can be worked out, but it's part of cleaning up the mess we left behind when we got the merge done. The other solution, which is a bit easier and kind of cheating, is: let's just not store all the blocks. We've talked about this with EIP-4444 on the execution client side for quite a while, and the consensus layer, learning from that, actually put in the spec that we don't have to keep all the blocks all the way back to genesis. It's never been specced that way. It's about five months of blocks that we need to store. So if we just started pruning those older blocks and deleting them from the database, even with this data duplication, we'd wind up with a pretty small database for our beacon node. It's probably enough. It certainly would be for a lot of people, and we can do the deduplication as well to make it even smaller. The really nice thing is that for consensus clients that aren't trying to be an archive node, that would make their disk usage almost static. It would grow, but only really slowly, as the number of validators grows and the beacon state grows, which is relatively small.
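That "about five months" figure can be sanity-checked from the spec. A sketch, assuming MIN_EPOCHS_FOR_BLOCK_REQUESTS = 33024 (the consensus networking spec's minimum block retention) and mainnet slot timing:

```python
# Back-of-the-envelope check of the "about five months" retention
# figure for consensus-layer blocks.
MIN_EPOCHS_FOR_BLOCK_REQUESTS = 33024  # consensus p2p spec minimum
SLOTS_PER_EPOCH = 32                   # mainnet
SECONDS_PER_SLOT = 12                  # mainnet

retention_seconds = (MIN_EPOCHS_FOR_BLOCK_REQUESTS
                     * SLOTS_PER_EPOCH * SECONDS_PER_SLOT)
retention_days = retention_seconds / 86400

# About 147 days, i.e. roughly five months of blocks a node must keep.
assert 140 < retention_days < 150
```

Anything older than that window is what a non-archive beacon node could prune, which is where the near-static disk usage mentioned above comes from.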
That way, we're avoiding this ever-growing creep of disk space usage that we've seen with Ethereum because of the growing world state. It's nice that the consensus client doesn't necessarily have to have that problem. The drawback is, obviously, if you're running an archive node, then you want to store all the blocks, you want to store all the data, and you're really going to want to lean on the deduplication to save disk space instead of this approach. The other part, which things like the Portal Network are aiming to solve, is that those older blocks become harder to find. You can't just request them from any peer on the network; they potentially become unavailable. It's technically okay, but it doesn't feel good for a blockchain to lose old data. So it's not something we want to encourage, and we want to have systems in place so that we do hold on to them. There's a bunch of research going on there, for example era files are defined, and we can use a number of things like that. Okay, so the other place where we see an interesting interaction between the consensus and execution clients. Oh, that water didn't help. Yeah, so this interaction between the two clients is around non-canonical blocks: blocks that we receive that we don't believe are part of the canonical chain, maybe because we received them really late or whatever. An execution client, particularly one that only stores one version of the world state rather than the full tree of all the world states, is really tempted to just not bother executing the transactions, because it doesn't want to store that world state; it's only storing one anyway. The engine API allows that: it lets the client say, this isn't worth executing, I'll store it, and if it becomes canonical, sure, I'll execute it, but I'm not doing it now. And it can return ACCEPTED. The problem is, on the consensus client side, we have to track that that block isn't fully valid.
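That tracking of "accepted but not yet executed" blocks might look something like this. A minimal sketch, assuming the engine API's payload status values (VALID, INVALID, SYNCING, ACCEPTED); the function and variable names are illustrative, not from any particular client:

```python
# Block roots the execution client has stored but not executed yet.
optimistic_blocks: set[str] = set()

def on_new_payload_result(block_root: str, status: str) -> bool:
    """Handle an engine_newPayload status. Returns True only when the
    block can be treated as fully validated."""
    if status == "VALID":
        optimistic_blocks.discard(block_root)
        return True
    if status in ("ACCEPTED", "SYNCING"):
        # Execution was deferred: import the block optimistically, but
        # remember we must not perform validator duties that depend on
        # it until the execution client reports it VALID.
        optimistic_blocks.add(block_root)
        return False
    # INVALID: reject the block (and anything built on it).
    return False

assert on_new_payload_result("0xabc", "ACCEPTED") is False
assert "0xabc" in optimistic_blocks
assert on_new_payload_result("0xabc", "VALID") is True
```

The painful case described next is when fork choice suddenly makes one of these optimistic blocks the head.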
So the execution client didn't execute it because, you know, world state's big and that makes a performance problem, that kind of thing. On the consensus client side, though, that delay in execution might become a problem if we have to reorg over to that block, because it's going to add delay right at the point where we're least able to afford it, that time-sensitivity thing. We're about to create a block, and we realize, hey, we're on the wrong chain, we're going to reorg over to this one. All of a sudden we've got a block we don't know is valid that we think is now on the canonical chain. We have to wait for the execution client to execute it before we can safely perform any validator duties involving that block. Best case, that delays the validator duties; but it may just cause them to be missed. So it's going to be really important to have some understanding of these different pressures between clients. For short reorgs, it's actually really important for the execution client to go ahead and spend a bit of CPU and disk IO to execute all those blocks, to avoid putting the consensus client in a time-pressure scenario. For long reorgs, though, deferring execution is probably okay: it's less common, and much less likely that we'll need to reorg to something that far away. Danny Ryan's talk in the opening session showed that particularly well; we're just not seeing many reorgs at all, which is fantastic. Okay, so summing this all up: there's a bunch of other problems out there, and clean-up that we need to do, these loose ends. But they are loose ends. They're not going to require more hard forks to fix. They're not going to require massive amounts of effort. It's just the normal day-to-day plugging away of engineering. So keep up with the latest versions of your clients, and life will keep getting better. I think we want to continue to learn from the other side.
Through the merge process, consensus and execution client teams have really started talking to each other a lot more than they did back before we all focused on the merge, and we want to keep that going. We don't, though, need execution client devs to be experts in the consensus layer. That split is powerful, and we want to really leverage it and look for other places where that kind of split can be useful. That's part of embracing this multi-component future of Ethereum: you don't need to know everything about every component of Ethereum. You can know your little part, even just within layer one. That's all I've really got time for. So, thank you for listening, and I do believe we have some time for questions, which would be awesome. Yes, we do. We have time for questions. Do we have any questions from the audience? Please raise your hand. Hey, cool talk, thanks. So you mentioned embracing that multi-component aspect, and I think for the most part we're on our way towards that. One area where that's not quite the case is between the beacon node and the validator across clients, right, for Prysm or Lighthouse or whatever. Is there a roadmap, with Teku for example, to allow a Teku validator to talk to, say, a Prysm beacon node? What are your thoughts there, and what are the challenges to accomplishing that? Yeah, absolutely, I think it is really useful to be able to have a Teku validator client talking to a Lighthouse beacon node. The good news is that the standards are there for that. From right back before the beacon chain launch, this has been theoretically possible. In practice it's been a bit hit and miss. We've actually seen really good compatibility with everyone else except Prysm. Prysm uses gRPC.
They kind of got in first and did it, and we didn't follow them; we did a REST API because we liked it more, and kind of outvoted them, so we put them in a hard place there. So their validator client uses gRPC. They do expose the standard REST APIs, so most clients' validator clients can connect to Prysm; Prysm's validator client generally can't connect to anything else. Otherwise they generally interop pretty well, particularly for Teku, because Infura was exposing the beacon node APIs, which they've definitely removed now, because people kept running validator clients against Infura, which is kind of terrible. So it is possible. It is something we need to ramp up testing on more; every so often we find a little corner case that didn't work as well as it should, those kinds of things. But a lot of the base layer is there for it. Mostly it works, and I think post-merge it's going to become more and more common, because you so often use a validator client that fails over between two beacon nodes now. There is another question. Hey Adrian, thanks for the talk. I want to ask what your thoughts are on the current design, where the consensus client is tightly coupled with the execution client. I mean, one consensus client has to talk to one execution client. What's the reasoning behind that? Another thing: what do you think about DVT, distributed validator technology, and how will it influence the future of clients? Yeah, so on the first part, it is pretty much the case, and I probably should have mentioned this, that one consensus client generally has a tight coupling to one execution client. The key reason is that the consensus client is writing to, updating, and controlling the execution client.
There are some solutions, and I think this is part of the multi-component stuff: you can actually run a middleware layer that will take the calls from the consensus client and spread them across multiple execution clients. I love that. I love that it's built as a separate thing, and not something that I as a client dev have to do myself, because it's hard and there's a lot of different policy scope there, in terms of: do I trust this as my primary execution client? What if they disagree? Do I want two out of three to agree? All those kinds of details we can build really well with a dedicated product, rather than trying to build it five times, once in each consensus client. In terms of distributed validator technology, I'm not a huge expert, but again, that's pushing towards becoming a middleware layer, and I think that's again really powerful and really good. It's certainly a great technology to help de-risk some operations with validators and enable the unique deployment models that are around there. So it's something I'm really keen to see keep evolving and keep finding new ways of doing things, and ideally we find ways of adjusting the validator APIs, those REST APIs I talked about before, so they're more and more conducive to an external piece of software being able to handle it all and handle it well. There might be some trade-offs there, we might hit some limits, I don't know, but it's definitely a technology to keep investigating. Thank you. Oh, thank you. Thank you very much, Adrian, that was amazing. Please, a big applause for him. Thank you, everyone.