So, welcome. We're here to talk about Infura, how we're hooked on it, and what we're going to do about it. My name is Jason Carver. I work at the Ethereum Foundation on all sorts of Python tooling, as well as the Trinity Python client. Let's get started.

So why are we hooked on Infura? Obviously, it's delicious. It's an obvious move for dapp developers to make it as easy as possible for new people to sign up and use their app, and plugins like MetaMask, Gnosis Safe, and so on have the same incentives. Infura is a way to skip installing a node, skip syncing a node, and get people on board. It's fairly obvious why that's easier and more pleasant for a lot of users. And frankly, a lot of casual users, say 80 to 90% of people, are always going to minimize effort; they're very willing to make tradeoffs around trust and custodianship to keep things easy. But there's a set of users we're interested in, the sophisticated users, whether that's hobbyists or people writing scripts to interact with the chain, and they're willing to put in more effort. The problem is that when they try, they go looking for documentation, they look for help in the apps, and they don't find anything. It's all built for that 80% user. So they just skip over it, and they often have no trigger later to switch to something different.

So maybe that's okay. How bad is it, really? Well, there are some obvious concerns, like downtime: Infura is a single point of failure, no matter how good their systems are, and they're very good. Perhaps the worst case is a bad hire, or someone cracking into their systems, who then serves you bad data. They can't directly, say, send your funds or assets from one account to another, but if they control your view of the world, it's pretty straightforward to get you to send funds or assets to whatever address they choose, for example by resolving an ENS name to a different address than the one on the real main network. There are other problems too, like leaking private data: everything about who you are and what you're doing goes out to Infura and the other apps along the way. And slightly more subtly, the fewer people running nodes, the easier it is to spin up other nodes and pull off manipulations like eclipse attacks.

So it's here, it's a problem. What are we going to do about it? There are a lot of avenues to approach this. There's dedicated hardware like DAppNode or GridPlus. There are people looking at how much storage a node takes on your computer. There are people looking at peer discovery. And the thing we've found to be one of the biggest pain points is simply the amount of time it takes to sync from a fresh start. These sophisticated users who are willing to try something out, they're hobbyists; they like to tinker. But they'll start up a node, try to sync it, and two hours later have no indication of how far along they are, so they say screw it, flip it off, and go do something more fun to tinker with. We want to fix that problem. The reigning speed champion is Geth's fast sync. That's the one to beat: about four hours, according to their latest benchmarks. Even that is going to be a problem for these users. So we want to take a look at that and see how much better we can do.
And part of the reason we're asking that question is that if we directly implement fast sync, we're going to do a lot worse, because Python is never going to match or beat Go in a head-to-head performance race. So we zoom into fast sync and look at which parts take most of the time, and the majority is downloading state. There are things like downloading headers and downloading receipts, but downloading the state of a recent block, which includes all the accounts and contract storage, is the vast majority.

So what are we going to do? It's time to upgrade our sync process. We're calling this next generation of sync Beam Sync, and the ability it gives us is to go from an empty node, no data at all, to executing a recent mainnet block in minutes (asterisk), with just-in-time state downloads. That's the main concept behind it. About that asterisk: I don't want to lie to you. It takes longer than that right now, because you have to find peers, and that's unreasonably difficult right now, and header imports can take a while unless you use a checkpoint, which has its own interesting side tracks. We will get back to those things, but we're considering them out of scope for Beam Sync right now.

At a high level, how does Beam Sync work? Roughly, we combine the ideas behind fast sync and stateless clients. A stateless client fetches just the data it needs to run a block and dumps it at the end of the block. Fast sync gets all of the data up front and saves it to disk. Beam Sync instead runs the EVM against an empty state, pulling data in as it's needed, but saving and storing it for future executions.

What kind of effect does that have? There are about 375 million trie nodes in the mainnet state database, and roughly 3,000 to 4,000 trie nodes are used per block. So that's about one hundred-thousandth of the data you need for one block versus for everything in the state, and it works out to about 250 times fewer requests on the network before the first block can run. We'll get into those numbers a bit more later.

We're approaching the halfway point and have obviously only glossed over Beam Sync, so I've got a post on Medium that goes into more detail, compares it to fast sync, and all that; I'll give people a second to grab the link. I'll also be posting further Beam Sync updates, as well as talking about how it impacts the whole network over time, so feel free to follow me there to find out more. And while people are getting their last shot of that: you may have noticed a discrepancy in the numbers on the last slide. If it's one hundred-thousandth of the data, why is it only 250 times fewer requests? Excellent question. Fast sync gets to batch its trie node requests together, 384 nodes per request, while Beam Sync as implemented now can only request a single node at a time. And because latency, the actual time a packet spends round-tripping between you and your peer, makes up the majority of the time, you lose a factor of 384.

Okay, we're about to do a deep dive. I'm sorry for the folks I might lose; it requires a bit of understanding of the EVM and the way tries work, and I won't have time to give background on it. For everyone else, I hope you have fun.
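Before the deep dive, here's the back-of-the-envelope arithmetic behind those two ratios and the factor-of-384 discrepancy, as a quick sketch using the rough figures above (estimates, not measurements):

```python
# Rough ratios of data and requests needed before the first block can execute,
# using the approximate figures from the talk.
TOTAL_TRIE_NODES = 375_000_000  # trie nodes in the mainnet state database
NODES_PER_BLOCK = 3_500         # roughly 3,000-4,000 trie nodes touched per block
FAST_SYNC_BATCH = 384           # trie nodes per request in fast sync

# Data: Beam Sync only needs the nodes for one block, not the whole state.
data_ratio = TOTAL_TRIE_NODES / NODES_PER_BLOCK
print(f"~1/{data_ratio:,.0f} of the data")  # ~1/107,000

# Requests: fast sync batches 384 nodes per request; Beam Sync v0 sends
# one node per request, so it gives back that factor of 384.
fast_sync_requests = TOTAL_TRIE_NODES / FAST_SYNC_BATCH
beam_sync_requests = NODES_PER_BLOCK
print(f"~{fast_sync_requests / beam_sync_requests:,.0f}x fewer requests")  # ~280x
```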
So what I'm going to do here is paint a big picture of how it works, but I'm going to put it on the slide bit by bit so we can stitch the understanding together. The idea is that we're going to jump into EVM code execution. This isn't exactly the way it really works, but it gets the point across.

Let's say we have a PUSH20 opcode: we push an address onto the stack. That didn't require any state; it was just in the bytecode, so even though our state database is empty, we can continue on. But the next opcode asks for the balance of that address. Now the EVM has to check the state database, which is empty, and try to extract that balance. All the node knows at this point is the hash of the root of the trie. We're going to call that f here, f..., and that's all we've got. So what are we going to do? How are we going to get the balance?

We find a friendly peer and ask it: hey, peer, do you know the node that has the hash f? This command is the exact same command that fast sync uses, so we're piggybacking on it; we don't require any new network protocols to make Beam Sync happen. So we say, peer, can you give us f? And they say, sure: the node f has children a and b. What does that mean? It means the hashes of the children of node f are a and b. (I'm lying to you a little here by pretending this is a binary tree rather than a modified Merkle-Patricia trie, but it still gets the point across.) We get a and b back and store them in our database. So here's what our local trie looks like now: we've got the root node and two children, and we don't know what's inside them.

What happens at this point? We're looking for the balance of a particular address, and we know which address we're looking for, so we don't have to fill out the whole tree. We can just follow the path to the address we want. Let's say the path takes us down a, so we want to know what the children of a are. We ask our peer, hey, what are the children of a? It says the children are d and e. We save that into our database. So now we're building up just the few pieces necessary to get at the balance of the address we care about. We do the same thing again: we know the particular address whose balance we want, and let's say the path leads us down to e. We ask our peer, and they say, hey, e is actually a leaf node. It contains the RLP of the account, which includes things like the balance you're interested in. We save that in the state database.

So now we've got three nodes in the state database. That's enough not only to know the balance, but to prove that the balance is part of the state root from the previous block. At this point we can read the balance out of that leaf node, push it onto the stack, and resume executing the EVM. Let's say the bytecode wants to know whether the balance was non-empty. So this is the gist of it. You can see that later on, maybe a different balance is asked for; again you'd be asking for nodes one by one. You get to skip the root node the next time, but the same concept applies. Similarly, this is showing three layers; the real mainnet trie is probably closer to six or seven deep at this point, depending on where the account is.
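To make that walkthrough concrete, here's a minimal sketch in Python of what the just-in-time lookup could look like. Everything here is illustrative rather than Trinity's actual API: the trie is simplified to one nibble per level, decode_trie_node is a hypothetical decoder, and peer.get_node_data stands in for the same node-by-hash request that fast sync uses.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol

class Peer(Protocol):
    def get_node_data(self, node_hash: bytes) -> bytes:
        """One round trip: return the node body whose hash is node_hash."""
        ...

@dataclass
class TrieNode:
    is_leaf: bool
    value: Optional[bytes] = None                 # account RLP, if this is a leaf
    children: Optional[Dict[int, bytes]] = None   # nibble -> child node hash

def decode_trie_node(raw: bytes) -> TrieNode:
    """Hypothetical decoder; a real client parses RLP-encoded trie nodes."""
    raise NotImplementedError

def get_account_rlp(db: Dict[bytes, bytes], peer: Peer,
                    state_root: bytes, path_nibbles: bytes) -> bytes:
    """Walk from the state root toward an account, fetching missing nodes on demand."""
    node_hash = state_root
    for nibble in path_nibbles:
        if node_hash not in db:
            # Cache miss: in Beam Sync v0, every missing node costs one full
            # round trip to a peer. The node is saved for future executions.
            db[node_hash] = peer.get_node_data(node_hash)
        node = decode_trie_node(db[node_hash])
        if node.is_leaf:
            return node.value  # nonce, balance, storage root, code hash
        node_hash = node.children[nibble]
    raise KeyError("no account at this path under this state root")
```

The key point is the cache-miss branch: each missing node is one full round trip, which is exactly the cost we run the numbers on next.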
So how long does that take? Let's say there are 3,000 nodes per block, and we only get to request one node at a time. We're connected to several peers, but we can only ask one peer per request, so we ask our best peer, where best in this case means the one we're closest to, the one that will round-trip fastest. Let's say our best peer is 100 milliseconds away. Work that all out and you get 300 seconds of waiting on state for one block.

That sounds like a problem. You can imagine that if it takes five minutes to download the state for a block, you're going to lag behind mainnet, and as you process the next block, you'll lag further. That would not be tenable. The good news is we don't have to wait for the first block to finish before we start executing the second one. What ends up happening is that you run these in parallel and catch up along the way, maybe staying perpetually five minutes behind mainnet. In reality it might look more like fluctuating between fifteen minutes behind and one minute behind, depending on the blocks you run into, your peers, and all that. So at this point you've turned on your node, you're five minutes in, maybe a few more minutes for peer discovery and headers, and you've got mainnet blocks executing on your local node. That's already a far better experience than four hours, or sometimes days, to run fast sync. But it's not good enough. How can we do better?

One thing we can do is find out from our peers which trie nodes are going to be needed in a block. We call that the block witness metadata, where the witness is all the state needed to execute the block, and the metadata is just the hashes of that state. That allows us to batch the requests back up into 384 nodes per request, and it allows us to spread those requests across multiple peers.

A quick look at what that might look like; it's very similar to the last walkthrough, so we'll go through it a little faster. Push an address onto the stack, ask for its balance, hit the empty state database; all we know is the root hash, so we find a peer to help us out. But now, instead of asking for the root node, we get a witness, or really what I'd call witness metadata. We know which block we're on, of course, so we can say: hey, peer, can you tell us which hashes we're going to need for block g? They say, sure: you're going to need f, a, and e from last time. At this point we can't actually store anything in the database. We just have a bunch of keys; we don't have any key-value pairs, and in fact we don't know how they stitch together. All we really know is that f is at the root. But we can use it to make the next request. We can batch the hashes together into a single request, we can send it to a different peer, we can split it up and send it to a bunch of different peers. We have a lot of options. And so we get back essentially all the data that was requested in the previous slides: f has children a and b, a has children d and e, and e is the account RLP. We save it into the database. That's enough to prove the account balance, push it onto the stack, and then ask follow-up questions like whether that balance is zero, which is a stateless call.

Okay, so how much does that help us? Let's run some numbers. Again, say 3,000 trie nodes per block as a rough number, but this time we get to group them 384 at a time. We're again assuming, and this looks about right from empirical tests, that most of the time spent is latency. So we get to group the requests up, and we also get to split them up, sending them to, let's say, four different peers at once. Now that we're sending to more than just our best peer, the average round trip time probably goes up; we have to rely on some peers that are a little farther away. So say each peer takes 500 milliseconds to round-trip. Run all of that through the arithmetic and you get one second of waiting on trie node requests per block.
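Here is that latency model as quick arithmetic, using the same assumed numbers (3,000 nodes per block, a 100 ms best peer for v0, four peers at 500 ms for v1). These are illustrative estimates, not benchmarks:

```python
import math

NODES_PER_BLOCK = 3_000

# Beam Sync v0: one trie node per request, serially, to our closest peer.
V0_RTT = 0.100  # seconds; assume our best peer round-trips in ~100 ms
print(f"v0: ~{NODES_PER_BLOCK * V0_RTT:.0f} s waiting on state per block")  # ~300 s

# Beam Sync v1: witness metadata lets us batch 384 hashes per request
# and spread the batches across several peers in parallel.
BATCH_SIZE = 384
PEERS = 4
V1_RTT = 0.500  # seconds; farther peers, so a slower average round trip
batches = math.ceil(NODES_PER_BLOCK / BATCH_SIZE)  # 8 batched requests
rounds = math.ceil(batches / PEERS)                # 2 rounds across 4 peers
print(f"v1: ~{rounds * V1_RTT:.0f} s waiting on state per block")  # ~1 s
```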
Now, that one-second figure is a prediction; the earlier number is measured. But this is a fairly straightforward extension that we're looking at, and it would be pretty much the whole game: if you can download all the state for a block in one second, you can keep up with mainnet quite easily from the moment you launch an empty node.

So how close are we? Well, v0 is prototyped in the Trinity alpha, and it works on mainnet right now. We have executed many mainnet blocks over and over and generated local witnesses, that kind of thing. But it's not production-ready; it's meant for experimentation right now. There's nothing left to research, though; there are no real open questions on v0 and how it works, just more coding to do. The witness metadata serving is in its design phase, so we have ideas about how to do it. That's coming up next, right after Devcon; follow along to see how it progresses.

People sometimes ask, well, why hasn't anyone done this before? And the answer is that no one had to. We were forced to, because Python is slow, so we had to ask these questions. There's a chicken-and-egg problem with why we couldn't use witnesses right away in v0: there were no servers of those witnesses, and Trinity can't serve them until it syncs. So we're going to bootstrap with v0 in order to sync v1.

So that's it. We're doing this now, and we're talking to other clients about getting it done for them too. That's a great thing about working at the Foundation: we get to share all the fun toys we make. And we're cranking away, so follow us and see what's new. It looks like we don't have time for questions, but you can find me on the side or in the hall; I'm happy to talk. Thanks for your time.