There we go. The slides are not so performant, but they are safe. So I wanted to give an overview of Ethereum 2 for information security professionals. What I want to do is show you where we're looking in the protocol and the implementations, and I'm hoping that people will be inspired to come along and perhaps fill in the gaps if you can see we're missing something, or assist with the things that we're working on now and work with the other teams.

So I'm going to be talking about validator safety, and I'm going to be talking about the phase zero software components, how we're viewing it as a set of software bits. Phase zero means Casper proof of stake without shards or execution. Adrian's going to be talking about validator privacy on the network and the differential fuzzing work that's been going on in Eth2.

So I want to talk about validators. As you know, they are replacing the miners as the block producers, so taking them offline will affect the liveness of the network and stop new blocks, which is very bad. I want to talk about the two failure modes of validators. So we have these two different modes.
These are offline validators and equivocating validators. An offline validator is a validator that's either simply not connected to the internet or the network, or is on a different, minority chain. From the view of the majority canonical chain, whether they're completely offline or on a different chain is basically the same to us. These validators stand to lose in penalties what they would have gained in rewards, so they're going to start leaking out at five to ten percent per year. This is the case where only a handful of validators are offline. If enough validators are offline to prevent finality, the system swaps into a mode where it starts to quickly eject validators. But this is just the general individual-case loss.

On the other hand, we have equivocating validators. These are validators that have produced conflicting messages. The easiest example is two different blocks at the same height. That's a double vote. They've violated the protocol, they're going to lose a lot of stake quickly, they're going to get slashed, and they're going to get ejected from the validator set.

These two modes have two interesting properties. The first is that taking a validator offline is generally an unprivileged attack. If you know a bug that causes a node to expend its resources, a denial of service, or perhaps crash, then you can generally do this by just connecting to it on the internet, sending in a packet, and watching it go down. On the other hand, making a validator equivocate should, if the software stack is good, require elevated privileges on the host machine.
So it requires someone to have access to the machine which is running the validator software. The reason is that if a validator keeps a history of its previously signed messages, it can check these messages and detect an equivocation before it signs a new one. If you imagine the equivocation detection and the signing as one black box, there should be no messages that you do or don't send to it that can cause it to equivocate without it being aware of it beforehand.

So now I want to move on to the software components. This is how the researchers and implementers have been thinking about this system, as two distinct components. One is the beacon node, the other is the validator client. The beacon node is the analogue of what we know as geth. This is the big piece of software that connects to the peer-to-peer network. It gets peers, imports blocks, does state transitions, and verifies signatures, but importantly it doesn't sign messages. That is the job of the validator client. The validator client connects to the beacon node and uses it as a source of truth. So it doesn't need to connect to the internet directly; it connects via the beacon node. It requests blocks and its duties from the beacon node, signs them, and then returns them.
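The "one black box" idea above, where equivocation detection and signing live together, can be sketched roughly as follows. This is a minimal illustration in Python, not any client's actual implementation: the `SigningGuard` class, its in-memory dictionary, and the placeholder `_sign` method are all assumptions made for the example. A real validator client would persist the history to disk and would also cover attestations, not just blocks.

```python
from dataclasses import dataclass, field


@dataclass
class SigningGuard:
    """Hypothetical black box that refuses to sign conflicting block proposals."""

    # slot -> root of the block this validator has already signed at that slot
    signed_blocks: dict = field(default_factory=dict)

    def sign_block(self, slot: int, block_root: bytes) -> bytes:
        previous = self.signed_blocks.get(slot)
        if previous is not None and previous != block_root:
            # A different block at the same slot would be an equivocation.
            raise ValueError(f"refusing to sign a second block at slot {slot}")
        # Record the message before any signature leaves the box.
        self.signed_blocks[slot] = block_root
        return self._sign(slot, block_root)

    def _sign(self, slot: int, block_root: bytes) -> bytes:
        # Placeholder only (assumption): stands in for a real signature scheme.
        return b"sig:" + slot.to_bytes(8, "little") + block_root
```

Whatever a beacon node sends, the worst it can do to this guard is get a request rejected; re-signing the identical message is harmless, and a conflicting one is refused before a signature exists.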
So this is the relationship we have: a bad beacon node can cause a validator client to be offline, by either sending it nothing or sending it garbage. But if there's a sovereign, well-maintained validator client, the beacon node should not be able to make it be slashed.

Moving on, this is a bit of a networking overview of the two components. I have these two binaries here, which are perhaps in the wrong Linux folders, but we'll ignore that. They don't necessarily have to be two distinct binaries, they could be the same thing, but this pattern is what we're seeing from the majority of clients, ours included. I'll show you why on the next slide.

In terms of networks, we have the internet over here, which is the untrusted zone. This is where we're connecting to anybody on the internet: we find peers via discovery v5 using UDP, and then we talk to those nodes using libp2p, generally over TCP. This is where people can start to do denial-of-service attacks, and this is where we need to be cautious. On the other hand, we have a private network over here. This is a LAN or a VPN, something we trust, and this is where the validator client connects to the beacon node by an API. I have said TCP here, but it could be anything. The reason we have this private network is to isolate the validator client from the internet, which reduces the attack surface and the number of people that can get at it.

And I should just mention the two databases here. The beacon node has the big, geth-style LevelDB or RocksDB database full of the chain, and the validator client keeps its own database, where it maintains enough of its history of messages that it can't get slashed, and perhaps some private keys, either on disk or on a Trezor.

So this is what I was mentioning before about the reason that we've separated the validator client out. It's because it can talk to
multiple beacon nodes. The typical pattern would be that the validator client connects to one of them and uses it as the source of truth of the chain, and perhaps when it produces a block or attestation it may send it to a few of them, just to ensure that the block gets propagated out to the network. The validator client can also maintain a sense of the quality of service of its beacon node. If the beacon node is not responding, or perhaps the validator client is referencing some other node to see whether its blocks are appearing in the chain, and it decides the quality of service is not good, it can just hop over to another beacon node. Because it has maintained a side database of signed messages, it may be offline during that hop if they're on different chains, but it shouldn't be able to be slashed.

So now it's my last slide. I just want to go through the inside of a beacon node, the big binary. This is a very simple overview, from a security-only perspective, of what's going on. Messages may come in here on the left from the internet and end up in our database, or perhaps we're serving a response, pulling something from the database and sending it back to the internet.

The first thing that we see from the internet is a networking stack. That's libp2p. We're being cautious here of eclipse attacks, where you surround the validator and give it a minority view of the network. We also have to be careful here of validator privacy; Adrian will talk about this. And we're wary of resource exhaustion here, if someone can make libp2p do too much work and slow us down.

Then we have marshaling, where we're encoding and decoding bytes from the network. We have a new encoding and decoding scheme called Simple Serialize, SSZ. It is quite simple and straightforward, but it is new and it uses pointers.
They're called offsets. So here we need to be quite careful of segfaults and all the interesting things that come with encoding formats.

On the higher level here, we have a consensus message that we may intend to import to our database, and on the lower level here, we're validating requests that we may give a response to. The consensus message validation is a very complex piece of software. This is the implementation of the eth2.0-specs repository. The eth2.0-specs is written optimized for readability, not for speed of execution, so one of the really important things that client developers must do is produce optimized versions of this specification. We need to be careful that these optimizations don't cause consensus forks: given the same block and state, we must produce the same post-state. We're paying particular attention here through fuzzing, which Adrian will talk about.

We're also very cautious of resource exhaustion here. Perhaps someone sends a block that's valid but takes a long time to process. We really want to avoid this. Something else we want to be particularly careful of is blocks where it takes us a long time to determine whether or not they came from a valid producer. Sometimes we have to do work to figure out who should have produced the block. We need to be particularly careful here, because if people can make us do work to figure out who should have sent it, that means anyone can send it to us. These are attacks that are open to anyone, not just people inside the validator set. This is something we've been actively working on lately.

There's a lot of arithmetic in here, addition, subtraction, and division, which can divide by zero, or we can have underflows and overflows. We're also doing a lot of access to arrays in here, where we're rewarding people, so this is where we need to be very careful of segfaults, too.

And then finally, here we have the request validation.
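Going back to the SSZ offsets mentioned above, here is a rough sketch of the kind of checking a decoder has to do before it trusts an offset from the network. It decodes an SSZ-style list of variable-length byte strings, where a table of 4-byte little-endian offsets is followed by the item data. The function name `decode_byte_list_list` is made up for this example. Python slicing is memory safe, so nothing here could actually segfault; the checks mirror what an implementation in a lower-level language would need.

```python
OFFSET_SIZE = 4  # SSZ offsets are 4-byte little-endian unsigned integers


def decode_byte_list_list(data: bytes) -> list:
    """Decode an SSZ-style list of variable-length byte strings, checking every offset."""
    if len(data) == 0:
        return []
    if len(data) < OFFSET_SIZE:
        raise ValueError("too short to hold an offset")
    first = int.from_bytes(data[:OFFSET_SIZE], "little")
    # The first offset points just past the offset table, so it also gives the item count.
    if first == 0 or first % OFFSET_SIZE != 0 or first > len(data):
        raise ValueError("invalid first offset")
    count = first // OFFSET_SIZE
    offsets = [
        int.from_bytes(data[i * OFFSET_SIZE:(i + 1) * OFFSET_SIZE], "little")
        for i in range(count)
    ]
    offsets.append(len(data))  # sentinel: the last item ends at the end of the buffer
    items = []
    for start, end in zip(offsets, offsets[1:]):
        # Reject offsets that go backwards or point outside the buffer.
        if start > end or end > len(data):
            raise ValueError("offsets out of order or out of bounds")
        items.append(data[start:end])
    return items
```

The point of the sketch is that every offset is attacker-controlled input: it is compared against the buffer length and against its neighbours before it is ever used as an index.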
This is where Ethereum 2 has a new networking protocol. During the design, we've been quite cautious to make sure that we don't allow requests that may take an unreasonable amount of time to respond to or process.

So this is the end of my slides, and we're going to be hearing from Adrian about validator privacy and differential fuzzing.

Yeah, so for the rest of this talk, I just wanted to give a very brief overview of two specific examples of security considerations in Ethereum 2: specifically, validator privacy and differential fuzzing.

So for validator privacy, we need to know what validators are first, which I'm sure you all know, but anyway: validators are Ethereum 2 entities that drive consensus by staking on Ethereum and performing specific tasks. In particular, they need to vote on shards and beacon blocks, and at the network layer they need to produce blocks and attestations. That means if you want to be a validator, you need to have a beacon node connected to the network that actively produces blocks and attestations.

So I want to try to give you a high-level overview of the actual issue we have with validator privacy in Ethereum 2. We're using a protocol called gossipsub to publish and propagate blocks and attestations on the network. Protocol Labs, I think in about an hour, are giving a deep dive into this particular protocol, but right now I just want to give a very high-level overview of roughly how it works, and why there's a potential issue or security consideration inside of that. So this is a simple peer-to-peer network.
It's what I'm trying to show here in this diagram. Each of the circles is a beacon node. The solid lines between them are direct connections, TCP/IP connections if you want, between the nodes, and you can consider this a very rough Ethereum 2 peer-to-peer network. Of all the nodes in the network, there's a subset which are validator nodes. What I mean by a validator node is a beacon node that has a validator client attached to it. They're the ones that are producing blocks or attestations.

When we publish a block or an attestation on the network, let's consider this validator node in the middle. When he or she wants to produce a block on the network, it uses gossipsub. The way this works to publish a block is that you select a subset of your connected peers and you send them the block, and in turn they do the same thing, and in turn they do the same thing, and the block or attestation gets propagated across the network such that the entire network receives the block.

Okay, so what's the issue with this? The issue is with the first peers you initially connect to: what happens if one of them is malicious? They have a direct connection to you, so they know what your IP address is if you're a validator client on the network. In practice, a malicious actor is going to deploy a whole range of these nodes, and through timing analysis they can work out who the validators are, specifically their IP addresses on the network. Assuming that you can perform this attack, you can essentially collect the IP addresses, or the physical addresses, of all the validators on the network. So why is this an issue?
It's an issue for a number of reasons. The first one is that if you know a specific validator's IP address, you can target that computer and you can DoS it. You can perform an eclipse attack, for example, which segregates it from the network. A validator that's segregated from the network can't perform the tasks it's required to do, and therefore it will lose stake. What that means in practice: if someone comes along and says that you look fat in a pair of jeans, you go, right, I'm going to find your IP address, you segregate them from the network, and they lose their stake as punishment.

A more severe and sophisticated attack is this: if you know all the validators' IP addresses on the network, you perform a large-scale attack on a majority of those validators and you kick a majority of validators off the network so they cannot perform their tasks. If you do that, you can prevent or delay chain finalization, because the majority of validators aren't finalizing or agreeing on blocks. You can amplify validator loss, because if a large number of validators are disconnected from the network at the same time, it amplifies the amount of stake that they lose. And you can increase your own rewards by removing competing validators, so it gives you some incentive to perform these attacks.

This is an active research topic in Ethereum 2 at the moment. We don't have a specific solution that we're actually going to use, but there have been some proposed solutions. One idea is that we can run a set of backup DoS-hardened nodes. What that means is that if you are running
a validator client with a beacon node, and you're publishing blocks and attestations and you find that they're not being received on the network, you can point your validator client to one of these backup DoS-hardened beacon nodes, which will then propagate them for you.

Another solution is that you could put all validator nodes behind Tor, I2P, or mixnets, which are network-layer infrastructures that mask your IP address. But there are some latency issues with this, and there has been some analysis of how bad this is.

Handel is an attestation aggregation strategy which has been proposed by PegaSys from ConsenSys. They're also dealing with this problem, so if you're interested in this kind of thing, I suggest having a look through some of their research papers. I'll have some references afterwards.

Another interesting idea is adding Dandelion inside gossipsub. Again, gossipsub is our message propagation system. Dandelion is an anonymization framework for anonymizing Bitcoin transactions on the Bitcoin network, and I just want to give a rough overview of how that might work in our gossipsub network, or what is currently being suggested.

With Dandelion, there's an initial phase. When we want to publish a block across the network, we don't use gossipsub immediately. We route it through a few peers first, and then those peers will propagate it via gossipsub on our behalf. This is a stochastic process, so there's a random number of nodes that it gets routed through.

As a very rough overview of how this works, let's consider the same validator node that wants to publish the same block. If we have Dandelion in it, the first, anonymizing phase is that we route it through two adjacent peers that we choose. Each of those peers then, for example, flips a coin as to whether they are going to propagate it via gossipsub on our behalf
or forward it on to another peer. So the first peer, let's say the one at the bottom, has flipped a coin and is going to propagate it across the network. The top one flips the coin and finds out: I'm not actually going to propagate it, I'm going to pass the block on to somebody else to then propagate it via gossipsub. Those two peers then do the same coin flip, and they find out that they're also going to propagate via gossipsub. So we now have three nodes that are going to propagate via gossipsub, which does the exact same process as what we did earlier and propagates across the network.

In this scenario, the blue peers are the first ones that have received it via gossipsub messages, and from their perspective it looks like the three nodes that started it are the source nodes. Each of the gray nodes that receives the message is unsure which the source was, because the packet it received could have come either from the source or from just another peer that has flipped the coin and propagated on behalf of somebody else. So this is the rough idea of anonymizing the validator client source inside of gossipsub.

Again, this is an active research topic, and we haven't decided on what we're doing. I encourage anybody that's interested in this kind of problem to look at the online discussions that we're having at the moment, or to contact us afterwards, or just have a chat, because there are some interesting problems that we should address here. Here are some references to look at for further details, as I only have time to go over this in a very hand-wavy way.

Secondly, I want to talk about differential fuzzing. Differential fuzzing is a project that Sigma Prime has recently adopted in an attempt to security-harden all Ethereum clients before mainnet. Let me explain what differential fuzzing is by first explaining what fuzzing is. Fuzzing is a security
analysis tool used to find abnormal behavior in software. If we consider an arbitrary program, for example a Rust SSZ crate, you can think of it as just an arbitrary function that has a set of inputs and a set of outputs. A fuzzer generates arbitrary data, puts it in as inputs, and verifies that the outputs are what you expect. Or if something explodes, then you know something's wrong with your code. If you actually have the source code, you can do something called guided fuzzing, which allows the fuzzer to instrument the source code and mutate the inputs such that it maximizes the code execution paths covered in your function. So you can maximally check that your code does what you expect, and you try to find these bugs.

So how does this relate to Ethereum 2? In Ethereum 2, there's a whole range of clients across a variety of programming languages. Each of the clients has implemented the consensus logic its own way, so there is a chance that any of these clients, given a set of inputs, could produce an output that is different to another client's. This is bad, because if we run this on a network and a malicious actor knew what that input was, they could segregate one of the clients from the rest. All users of that client on the network could, in principle, end up on a consensus fork, which is something we really, really do not want.

So the idea is that before we hit mainnet, we want to ensure that all of the clients that have implemented the Ethereum 2 consensus logic implement it in such a way that pretty much every input we can possibly give them produces the same output amongst all the clients, so we don't get this forking amongst the clients.
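The idea described above, giving several implementations the same input and flagging any disagreement, can be sketched in a few lines. Everything here is a toy made up for illustration: the two "clients" are stand-in functions for a single balance update, one written for readability and one deliberately buggy "optimized" version that emulates 64-bit wrapping arithmetic, and the fuzzer is purely random rather than coverage-guided. The real project plugs in the clients' actual state transition functions.

```python
import random


def reference_transition(balance: int, penalty: int) -> int:
    """Readable 'spec' version: a balance can never go below zero."""
    return max(balance - penalty, 0)


def optimized_transition(balance: int, penalty: int) -> int:
    """Toy 'optimized' version emulating 64-bit wrapping arithmetic; deliberately buggy."""
    return (balance - penalty) % 2**64  # an underflow wraps around instead of stopping at zero


def differential_fuzz(implementations: dict, rounds: int = 1000, seed: int = 0) -> list:
    """Feed identical random inputs to every implementation and collect disagreements."""
    rng = random.Random(seed)
    disagreements = []
    for _ in range(rounds):
        balance = rng.randrange(0, 100)
        penalty = rng.randrange(0, 100)
        outputs = {name: fn(balance, penalty) for name, fn in implementations.items()}
        if len(set(outputs.values())) > 1:
            # Same input, different post-state: a potential consensus fork.
            disagreements.append(((balance, penalty), outputs))
    return disagreements
```

Calling `differential_fuzz({"spec": reference_transition, "fast": optimized_transition})` returns the inputs on which the two versions diverge, which in this toy are exactly the cases where the penalty exceeds the balance.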
So specifically, each of the clients implements a whole range of functions at the network level, the consensus level, and everywhere, but the main core consensus things we really want to test are the state transition functions. There's a whole heap of functions involved in this, but you can summarize them into three main functions: state processing, block processing, and epoch processing. These are the main ones we want to target, to ensure that all the clients conform.

The way that this is done, as a very high-level overview in a single slide, is that we have a piece of software which we call a differential fuzzer. It has plugins that allow us to plug in each of the clients, more specifically their state transition functions. The fuzzer itself will, similar to an ordinary fuzzer, generate a whole heap of random inputs designed for these state transition functions, send them individually to each of the clients, get a response, and make sure all the clients agree on the response. If there's a difference, then we know that there's a difference between the implementations in the clients, and we're going to check this. This is something that we are currently actively developing. We still need to onboard a number of clients, and again, anyone that's interested, I encourage you to come and have a chat with us afterwards.

So I only had enough time to really just go over, at a very high level, some security considerations, to give you a rough idea of some of the things that are going on. But this is the end of our talk. If you're interested in any of these topics, I encourage you to come and contact Paul or me, or anyone around. Thank you.