My name is Everett Muzzy; I'm a researcher and a product marketer at ConsenSys, and thank you for being here. I'm Mally Anderson, and I'm a writer at ConsenSys.
Over the last several months, we've been trying to figure out ways to measure and define the decentralization of the Ethereum network. Generally, the most popular comparative metric for blockchain protocols is throughput. Most of us know that the more decentralized and secure a blockchain is, the lower its throughput, at least at this stage in the tech's evolution. So transactions per second is not a very effective measurement. So what do we actually talk about when we talk about decentralization? How can we objectively measure its extent and monitor its evolution over time? Obviously decentralization isn't a binary, is-or-isn't condition, but a very complex and emergent process that will change as the network grows. What data can we measure objectively?
Originally, we set out to define a quotient or some standard measurement that could replace throughput as a comparative data point. But we quickly realized that we get into apples and oranges pretty fast when we try to compare most of these metrics across protocols. So we narrowed our focus a bit to ask: what can we objectively measure on Ethereum right now and watch change over time? What's actually happening on mainnet, and what does it tell us about the progress that we're making or not making?
So here are the questions we set out to answer. Is Ethereum actually getting more decentralized over time? Are there metrics by which the network is getting more centralized? Does the data reveal areas we need to focus on addressing or changing? Given some of the trends we're observing, can we make any meaningful predictions about the future? And which of these metrics can be compared across protocols?
So our approach to measuring Ethereum decentralization over time began with determining which elements of Ethereum's architecture, both on and off chain, most significantly impact base decentralization. To start off, at this stage, the research identified 19 key subsystems spread across four different categories to investigate. We attempted to anchor our conclusions in as much on-chain and objective data as possible, which can get tricky. It's also important to note that there are data points that we have not covered that we wish we could. We consider them important, but they are not necessarily on chain and they can be very difficult to quantify. Those include concepts like the strength and distribution of the power grids on which nodes run, the legal jurisdictions, and the relative stability of the countries in which large numbers of nodes are hosted.
For as many of these data points as possible, we tracked their evolution quarter over quarter as far back as possible: essentially from the earliest days of Ethereum, through gradual adoption, rampant speculation, hacks, CryptoKitties, the bubble, and the subsequent course correction that we've seen in the past year or so. We measured and charted about 20 or so different metrics. Obviously we don't have time to run through all of those today, so we chose a few of them, spread across these different categories, that we're going to talk about for a bit.
Looking at account growth, we started off with total account growth over time compared to active account growth over time. The x-axis is quarter over quarter across the past four years, as it is for most of the charts that you'll see. The blue line is total address growth over time and the red line is active addresses, which we define as addresses that transacted during that period. And that spike in Q2 is the Shanghai attacks during Devcon 2.
So we see total accounts increasing steadily over time, but we see active addresses more or less starting to flatten after the bubble. What's interesting to note here is the growing delta between the two lines over time, which is likely due to the increase in smart contract activity on the network rather than just simple P2P value transfers.
Here's another account growth graph. This one shows account growth over time by total addresses in gray versus addresses with a non-zero ETH balance in orange. What we see is a fairly steady linear increase in non-zero addresses on the network with no major bumps, even during the dramatic price fluctuations. We can't necessarily say this means there's been a steady increase in the number of individuals on the network, because addresses are pseudonymous. But what we should consider again is the growing delta between the new addresses and the new non-zero addresses, which is getting steadily wider as more smart contracts emerge, since smart contract addresses have no ETH balance. We're seeing a pretty clear increase in both non-zero addresses and smart contract addresses over time. This evolution could indicate that the network is supporting more diverse, and thus more decentralized, types of business logic executed on mainnet.
The following graph illustrates ETH ownership of the top 10, 100, and 1,000 addresses compared to total supply over time. At the bottom in red are the top 10, then in yellow the top 11 to 100, then the top 101 to 1,000, and then all the way up to 100%. The story that this tells is fairly obvious. We see the top 10 and 100 addresses on the Ethereum network owning a steadily lower percentage of the total ETH supply, though that could likely just be the passive result of increasing supply over time.
We do see some of these larger accounts possibly being pushed down into lower tiers of ownership while others move up in their place, which could also account for the recent uptick we see in the percentage of ETH held by the top 1,000 addresses in the past few quarters. When we look at the general trend of ownership since 2015, we see that ETH ownership is becoming more diverse across addresses. We can't assume that more new addresses means more new unique individuals participating in the network. However, we can see that the number of non-zero addresses is increasing and the concentration of the top 10 and 100 is decreasing alongside each other over time. This could possibly suggest that, contrary to the popular narrative, the crypto bubble was not overwhelmingly followed by whales and holders buying back crypto at those lows just to make a buck off of what would be the eventual market uptick. Rather, this negative correlation could suggest that more and more new people began accumulating at a steady rate after the bubble, which, alongside growing ETH circulation, could have been reducing the percentage of concentration of these top 10 and 100 whale holders.
With the ecosystem being as young as it is, the unequal concentration of wealth isn't necessarily a major red flag for decentralization in the long term, even with the still large concentration of wealth among a few whales. Ethereum is still pretty far ahead compared to other blockchain protocols, at least on this metric. Looking ahead, however, ETH concentration in the hands of a few becomes more of a concern when we shift to proof of stake and influence on the network becomes more closely correlated with ETH ownership. As the beacon chain gets more functional and as PoS replaces proof of work, it'll be important to watch out for staking power concentrating in the hands of a few.
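The ownership tiers described above (top 10, top 11 to 100, top 101 to 1,000, and everyone else) can be illustrated with a minimal sketch. This is not the code used for the talk's charts; the function name `tier_shares` and the sample balances are hypothetical, and a real analysis would load one balance snapshot per quarter from chain data.

```python
from typing import Dict, Iterable, Sequence


def tier_shares(balances: Iterable[float],
                cutoffs: Sequence[int] = (10, 100, 1000)) -> Dict[str, float]:
    """Split total supply into ownership tiers by address rank.

    Returns the fraction of total supply held by each rank band
    (for example ranks 1-10, 11-100, 101-1000) plus the remainder.
    """
    ordered = sorted(balances, reverse=True)  # largest holders first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total supply must be positive")

    shares: Dict[str, float] = {}
    start = 0
    for cutoff in cutoffs:
        # Sum the balances ranked start+1 .. cutoff (slice end is exclusive).
        shares[f"top {start + 1}-{cutoff}"] = sum(ordered[start:cutoff]) / total
        start = cutoff
    shares["rest"] = sum(ordered[start:]) / total
    return shares


if __name__ == "__main__":
    # Made-up balances purely for illustration, not real Ethereum data.
    sample = [5000.0, 3000.0, 1000.0, 500.0, 250.0, 150.0, 60.0, 40.0]
    for tier, share in tier_shares(sample, cutoffs=(2, 4)).items():
        print(f"{tier}: {share:.1%}")
```

Computing these shares for each quarterly snapshot and stacking them would give a chart of the same general shape as the one discussed, under the assumption that each address is treated as a separate holder.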
So this graph shows the total circulating volume of ETH compared to the circulating volume of select ERC-20 tokens, which are shown as token value in ETH. The green line shows the total amount of circulating ETH, i.e. ETH moving between addresses quarter to quarter. It's essentially correlated with the price of ETH, with the spike in circulating ETH aligning with the price high in late 2017 and early 2018. The bar chart shows the value of a select few ERC-20 tokens circulating quarter to quarter. We looked at the top 10 by market cap, plus a few interesting ones we wanted to look at, like Dai, 0x, Matic, and Loom. The purpose of this graph is to see if the Ethereum network is getting more diverse from both a utility and a speculation perspective. What we think it shows is that despite a relatively stagnant ETH price recently, the ETH value in circulating tokens is increasing dramatically. And not only is the circulating value increasing, but the diversity and market share of tokens are increasing too, suggesting that users are using more ERC-20 tokens and doing more with them across the board.
These two graphs, one on the top and one on the bottom, show the growing concentration of mining pools over time, as measured by percentage of total block production on the top and by percentage of total miners being paid out on the bottom. In each of these graphs, each color corresponds to the same mining pool. So for instance, the green that's on the bottom of each graph is Ethermine, and so on for the rest. Over time, we see that four pools have really started to dominate the mining pool landscape. Those are Ethermine, F2Pool, SparkPool, and Nanopool. Collectively, they've edged out previous competitors like MiningPoolHub and DwarfPool, whose share you can barely see in the past few quarters.
Those four major pools now account for over 72% of quarterly block production (again, the top graph), and they pay out to about 83% of the miners across mining pools (again, that's the bottom graph). In particular, we're seeing a possibly concerning dominance in block production by Ethermine and SparkPool, which today account for just under 50% of blocks produced per quarter. And together, we're seeing Ethermine and Nanopool pay out to nearly 70% of the miners on chain. So the concentration of influence among a few mining pools is definitely not ideal, but it's not necessarily a major concern by itself. Miners are pool agnostic; they will migrate to whatever pool offers the best incentives. And if we assume rational behavior by miners, then if a single pool were to reach a hash rate close to 50%, or start colluding with other pools to mount a 51% attack, it's not outside the realm of possibility to presume that miners would abandon those pools to protect their income and switch to other pools.
However, when we look at the number of mining pools and miners over time, as shown by this graph with mining pools in orange and miners in red, we see a distinct decline in both over the past year. In short, what this means is that we have fewer miners active on fewer mining pools, and fewer mining pools responsible for network maintenance. As a quick note, it's important to say that the number of miners in red is not entirely accurate. We based the number of miners in mining pools on the on-chain payout addresses, but some mining pools pay out their miners off chain, through more traditional bank deposits. So these numbers will be a bit off, but we can't see those off-chain numbers. Overall, mining pools could be an area of increasing centralization on Ethereum.
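The block-production share behind figures like "four pools account for over 72% of quarterly block production" can be sketched as below. This is an illustrative assumption about how such a share might be computed, not the pipeline used for the talk: the function names `pool_shares` and `top_k_share` and the pool labels are hypothetical, and the input is assumed to be one pool label per block mined in a quarter.

```python
from collections import Counter
from typing import Dict, Iterable


def pool_shares(block_producers: Iterable[str]) -> Dict[str, float]:
    """Fraction of blocks produced by each pool, largest pool first."""
    counts = Counter(block_producers)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no blocks supplied")
    # most_common() orders pools by block count, descending.
    return {pool: n / total for pool, n in counts.most_common()}


def top_k_share(shares: Dict[str, float], k: int) -> float:
    """Combined share of the k largest pools."""
    return sum(sorted(shares.values(), reverse=True)[:k])


if __name__ == "__main__":
    # Made-up block attribution for one quarter; labels are placeholders.
    blocks = (["pool_a"] * 40 + ["pool_b"] * 25 + ["pool_c"] * 15
              + ["pool_d"] * 10 + ["pool_e"] * 6 + ["pool_f"] * 4)
    shares = pool_shares(blocks)
    print(f"top 2 pools: {top_k_share(shares, 2):.0%}")
    print(f"top 4 pools: {top_k_share(shares, 4):.0%}")
```

The same two functions applied to payout addresses per pool, instead of blocks per pool, would give the miner-payout share, with the caveat noted in the talk that off-chain payouts are invisible to this kind of count.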
We have lower ETH prices, reduced block rewards, and a fairly stagnant hash rate, which could be the reason miners have not been joining mining pools recently, and there may also simply be efficiency gains from concentrating mining power in fewer and fewer pools. With an eye toward the future, the eventual switch to proof of stake will redefine this metric of centralization, but until then it's fairly prudent that we keep an eye on how this concentration of mining pools and miners, and their relative power, evolves over time.
Next, we'll look at node diversity. This data is from the node tracker on Etherscan, and it shows node count by country starting in 2018, which is unfortunately only as far back as the data goes. It's a heat map, so the warm colors are the highest node counts and the cooler colors are the lower node counts. In general, we definitely see some unfortunately bare areas that stay pretty consistent over time, such as the African continent, but we do see pretty uniform fluctuations rather than random sudden spikes in particular geographical areas that are out of step with other countries. It's also helpful to see that the network is physically quite distributed and decentralized across the globe and across a variety of legal and political systems.
These two graphs look at node count in relation to the price of ETH and to node size. There's been a lot of fluctuation in the total countable nodes running the Ethereum network. Right now, for example, I think the number is about half of what it was around this time last year. On the surface, that looks a lot like centralization. There are fewer nodes overall, and presumably fewer people running nodes. There are a lot of reasons why that could be, but these graphs show two factors we looked at to see if there's a correlation. On the left, we're comparing the number of active nodes to the average ETH price for the quarter. Actually, I think this is week over week.
It's a common belief that when the price is high, more people think it's worth it to run nodes. That doesn't actually seem to be the case over the course of 2019, which is, again, all the data we were able to get. When we look at the node count in blue on the same graph as the ETH price in green, it looks like there's actually no correlation: the node count was at one of its lowest points in June, when the price was at its highest, and the count was quite high during the price dip at the end of the summer. So even if this correlation might have been true historically, and that's why people think it exists, it doesn't seem to be true right now.
Then on the right, we checked the node count against the total size of the default node on Geth, which is the red line, and on Parity, which is the orange line. About 97% of all the nodes on the network are running one of those two clients, and about 95% of the nodes on the network are default nodes, as opposed to archive nodes, which are, of course, much larger. Obviously, the average node size increases at more or less the same pace over time, as blocks are mined. It seems reasonable to assume that as the default node size gets bigger, it gets more expensive and takes more energy to keep the node running and keep it synced, so maybe fewer people are doing so. And that looks like it's the case when you look at the graph on the right, or at least we see a clearer relationship than there is with the price of ETH. As we move toward PoS and sharding the network, the node size burden won't have to be as much of an issue, so maybe this attrition won't continue on the same trend into next year. There are also some interesting experiments happening around the ecosystem to find ways to make running a node easier and cheaper, so we want to keep tracking this one through the changes that the next year is going to bring to the network.
So, we set out to start measuring decentralization.
We, of course, envisioned this fabulous graph or tracker that would compare consistent decentralization metrics across protocols, and that very quickly, as Mally said in the beginning, devolved into comparing apples and oranges. We, of course, recognize that with different consensus algorithms, you can't compare the same metrics across protocols. So we took the original 19 subsystems that we showed and asked: which ones could we possibly and confidently compare across at least a few protocols? For instance, what a metric means for a proof-of-authority (PoA) blockchain is very different from what it would mean for a proof-of-work or proof-of-stake blockchain. So the next step would be to take this adjusted series of subsystems and see if we can confidently compare them across chains. The difficulty is, of course, access to data for those chains.
So what conclusions can we take away from all this? First, we think the network is most clearly decentralizing when it comes to the reduction in the holding power of the top 10 and 100 addresses. As we head toward Ethereum 2.0 and proof of stake, watching what happens with these ownership percentages will be crucial. Second, we think that the clearest area of centralization is in the mining pools and the concentration of power among a handful of pools. We also suspect that node attrition is a big one, but we just don't have enough historical data to predict what it's going to do later. Third, as a larger point, we're not necessarily trying to make a decision about whether the ecosystem is doing a good or bad job of decentralizing Ethereum, or even making any value judgment. It's obvious that the activity on Ethereum is getting more diverse. Developer mindshare and activity are growing.
There's steady progress on security, and the introduction of use cases like DeFi has injected a lot of factors into decentralization and activity that we don't necessarily have time to analyze yet. So it's safe to argue that Ethereum is far ahead of most other protocols in terms of network and ecosystem decentralization. The fourth conclusion is that over the lifetime of the network we're going to see much greater complexity and more layers of activity happening off chain, which we can't see here. That will just get more true over time. So in some sense, this stuff might get even harder to track, but that also makes the security and decentralization of the base settlement layer that much more important to watch.
We also wanted to point out that it's really important to have this historical data about the network so we can all watch these metrics continue to evolve. I would say Ethereum has much better tracking data than any other protocol, but it can still be surprisingly hard to find, let alone make sense of, this very zoomed-out, big-picture stuff. Etherscan shared some of their non-public historical node tracker data with us, which we really appreciate, and we also want to say thanks to our colleagues at Alethio, especially Danny Sweet and Moomoraki, for helping us pull all this data and create these visualizations.
As mentioned, we didn't have time to go over a lot of the other graphs that we had pulled and analyzed for the study of decentralization. We have about a dozen or so more that we didn't have time to look at, but we're going to build a page on Alethio, which is a data analytics platform, where anyone interested can view those graphs. Hopefully in the near future we'll have a live tracker of these metrics over time, so we don't have to constantly update them and reanalyze them.
These are just a few of the other graphs we don't have time to analyze, including function calls: the small one on the top left does include transfer, and on the bottom right we just removed transfer to get a clearer view of the other function calls. We also looked at ETH trading volume on DEXs, then gas fees, and then a different way of looking at the top address concentration. We really hope this is something of interest to the network, and that people continue asking questions about it, continue exploring it, and suggest some additional metrics or ways to quantify those subsystems over time. If you have any questions, please definitely reach out to us, and enjoy the rest of Devcon.