We'll move on now to our next speaker, who will be joining us virtually: Edward Chi. Edward is a data scientist at Starboard, where he's part of the network analytics team that designs products and pipelines to support network governance in the Filecoin ecosystem. He's passionate about geometry and number theory, but today he'll be talking to us about Filecoin analytics for network governance. Take it away, Edward.

Thank you, thank you for the introduction. Hello everyone, this is Edward from Starboard. Today I'm going to give a talk about Filecoin analytics for network governance. Let me begin with a simple observation: Filecoin has become the most vibrant data economy in the Web3 space, with committed storage capacity of over 15 EiB and active client storage deals of over 70 PiB. One of the key issues arising from being such a huge complex adaptive system is: how do we properly govern it, especially given that we're adopting a community-driven, democratized governance framework? There is a strong need for an accurate, comprehensive, and open-source data analytics infrastructure upon which real-time collective decision-making can be built. As a team, we've identified a few key challenges intrinsic not only to the Filecoin governance framework but to Web3 governance in general, and we believe data analytics is precisely the key to addressing them. Let me go through them one by one. First, observe that each Web3 network is essentially an island economy involving decision-making by individuals and organizations with different preferences, goals, and horizons.
What this means for the governance team is that they need to understand how to balance the trade-offs of different decisions, given that agents in the environment have different preferences, goals, and horizons. For instance, say the governance team wants to introduce a Filecoin Improvement Proposal, commonly known as a FIP. They need a very clear understanding of who benefits, at whose expense, and how this particular FIP will impact the entire network, not only at the aggregate level but also at the actor-specific level. Second, even though everything is publicly available on chain, public information is not properly diffused. This is a key problem. If I ask you: what is the current state of the network? What are some population statistics? What are the micro trends and issues existing in the network? We need expertise, a middle layer, to extract those on-chain insights from the blockchain and make them available to the general public. The third observation is that governance doesn't have to be emotional and political. The key is to have accurate and robust data analytics upon which the community can make informed decisions toward their shared goals and values. So far I've established that network analytics is important for governance. But how do we proceed from first principles to actually building the data pipelines that turn insights into products? We start from two initial observations specific to Filecoin and derive systematic strategies from them. First, notice that, similar to Web2 data solutions, Filecoin is a data economy consisting of a globally distributed network of servers and clients.
This implies that, like Web2 services such as AWS or Azure, we need to track the operational intelligence traditional to Web2 data service provision: things like storage uptime and reliability metrics, which ties into the fantastic talk Tom just gave. We need those metrics to properly evaluate the effectiveness of the storage provision service. We also need clear KYC: for example, which addresses are making the most interactions on the network. The second observation is that, unlike Web2 solutions, Filecoin is the world's data center. It's a decentralized layer-one protocol that democratizes storage provision and data economy services for every participant on the network. This means we also need to track intelligence specific to blockchain analytics: metrics like circulating supply, gas, and block rewards, concepts native to the blockchain ecosystem that are not traditionally available in Web2 services. Moving on from the observations, here is the roadmap we derived from them. First, understand Filecoin's protocol structure and build expertise in analyzing the state-synchronized data: call some RPCs, pull the data out, and build databases. Next, map the on-chain data to the corresponding protocol features and design specifications, which are usually available in the source code or the open-source spec.
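The first roadmap step above, pulling synchronized state out of a node over RPC, can be sketched roughly as follows. This is a minimal illustration assuming a Lotus node exposing its JSON-RPC API; `Filecoin.ChainHead` is a real Lotus API method, but the local endpoint URL and the helper names are my own assumptions, not Starboard's pipeline.

```python
import json
from urllib import request

# Hypothetical local Lotus JSON-RPC endpoint (assumption for illustration).
LOTUS_RPC = "http://127.0.0.1:1234/rpc/v0"

def rpc_payload(method: str, params: list, req_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request body for a Lotus API method."""
    return {"jsonrpc": "2.0", "method": method, "params": params, "id": req_id}

def call_lotus(method: str, params: list) -> dict:
    """POST a request to the node and return the decoded `result` field."""
    body = json.dumps(rpc_payload(method, params)).encode()
    req = request.Request(LOTUS_RPC, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["result"]

def tipset_height(chain_head: dict) -> int:
    """Extract the epoch height from a `Filecoin.ChainHead` result."""
    return chain_head["Height"]

# Example (requires a running node):
#   head = call_lotus("Filecoin.ChainHead", [])
#   print(tipset_height(head))
```

From there, the decoded results would be written into a database and joined against the protocol features mapped in the next step.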
Then we translate that know-how into actionable insights, charts, or statistics that support decision-making for the relevant stakeholders in the ecosystem. The last point is the most important: after the first three steps, we build a comprehensive data analytics platform that provides intelligence across different angles and at different levels of fidelity. With the principles, methods, and roadmap discussed above, we introduced the network health dashboard. This is our flagship product, which analyzes the network's health state through different angles; all of the principles and methods I've just discussed are embodied in it. We provide a section-based user journey: through the different sections, we lay out a clear way of thinking about how the governance team should approach analyzing the current state of the network. The best way to summarize the product: it's a one-stop, comprehensive intelligence platform that provides not only current but also historical data analytics for the ecosystem, especially for the governance team. Let me briefly tour each subsection and the information we capture. We're looking right now at the first section, capacity and service. This is where we track information related to storage provision on the network. For storage state, we ask questions like: what is the amount of active storage currently on the network? How much sector storage capacity has been committed on a daily basis? We also track reliability indices, the stuff Tom just talked about: if faults happen, how quickly are they recovered?
What is the average fault-time distribution at an aggregate level for the network? We also track service provision economics, which tells a storage provider considering putting storage onto the network about their return on investment and the block rewards they would earn if they committed capacity now. Moving on, the next section is circulating supply. This tells you about token flow: how much FIL gets mined, vested, locked, and burned on a daily basis. You can also do data drilling, which is the point of the whole exercise: you can read the component breakdowns behind any aggregate statistic you find interesting. For instance, you can take a detailed look into the different components of FIL locked, or the amount of FIL burned; all of those detailed breakdowns are there, following the protocol specification. Moving on, we have storage demand and deals, the demand side of the equation. Here we track things like deal inflow and outflow, current and historical. We also do KYC: even in a relatively anonymous environment, we want to identify the top addresses making the most deals within the ecosystem. Next is an exciting topic, network usage and gas. We track the current base fee and trace it back over the past 24 hours, following the EIP-1559-equivalent fee mechanism the network adopted.
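The token-flow accounting behind the circulating-supply section above follows the protocol's definition: circulating supply is what has been released (vested, mined, disbursed from reserve) minus what has been removed or immobilized (burnt, locked). A minimal sketch; the function names are mine, and the authoritative breakdown lives in the Filecoin spec:

```python
def circulating_supply(vested: float, mined: float, reserve_disbursed: float,
                       burnt: float, locked: float) -> float:
    """FIL free to circulate: tokens released minus tokens burnt or locked.

    Mirrors the protocol-level identity
    circulating = vested + mined + reserve_disbursed - burnt - locked.
    """
    return vested + mined + reserve_disbursed - burnt - locked

def locked_breakdown(initial_pledge: float, locked_rewards: float,
                     deal_collateral: float) -> dict:
    """Component breakdown of FIL locked, the kind of data-drilling
    view the dashboard exposes (components per the protocol spec)."""
    return {
        "initial_pledge": initial_pledge,
        "locked_rewards": locked_rewards,
        "deal_collateral": deal_collateral,
        "total": initial_pledge + locked_rewards + deal_collateral,
    }
```

A daily time series of these quantities is exactly what the aggregate charts in that section plot.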
We also want to understand, if there is a base fee spike, what gas usage can be attributed to that particular spike. This is where we put things like gas usage broken down by method, giving you a volume comparison. We also track aggregation: aggregation is a scalability innovation the Protocol Labs team made for proving storage, and once this fantastic technology is out, we want to track how often storage providers are actually using it to reduce their transaction costs. I've talked a lot about structure, theory, and principles, so let me give a concrete use case of how a product like this can help with Web3 governance. This is a real-life use case. Over the past few months, starting in January 2022, there have been various initiatives in the Filecoin ecosystem focused on driving Fil+ deal adoption and increasing network utility. For those of you not familiar with the term, Fil+ deals are deals with better KYC and tighter identity verification. Imagine I'm a business analyst asked to write a report on the initiatives driving Fil+ adoption, and at the same time asked to quantify the impact of the various strategies developed to drive that adoption. Here's how I can use the dashboard. Next page. The first place I go when I wake up in the morning is the storage deals section of the dashboard, where I observe a chart called newly committed deals. This tells you how many deals flow into the system on a daily basis.
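The base fee dynamics mentioned above follow an EIP-1559-style update: the fee drifts up when blocks consume more gas than the target and down when they consume less. A sketch of that update rule; the constants (5 billion gas target per block, max change denominator of 8, 100 attoFIL floor) are taken from published Filecoin parameters, but treat this as an illustration rather than the consensus implementation:

```python
BLOCK_GAS_TARGET = 5_000_000_000   # half the 10B per-block gas limit
BASE_FEE_MAX_CHANGE_DENOM = 8      # caps the change at ±12.5% per epoch
MIN_BASE_FEE = 100                 # floor, in attoFIL

def next_base_fee(parent_base_fee: int, gas_used: int) -> int:
    """EIP-1559-style base fee update driven by block fullness."""
    delta = (parent_base_fee * (gas_used - BLOCK_GAS_TARGET)
             // BLOCK_GAS_TARGET // BASE_FEE_MAX_CHANGE_DENOM)
    return max(parent_base_fee + delta, MIN_BASE_FEE)
```

Under this rule a sustained run of full blocks compounds the fee upward, which is why a base fee spike in the dashboard is a direct signal of heavy gas usage worth drilling into by method.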
This tells me, as a hypothetical analyst, that over the last three months Fil+ deals have taken over the majority of deal growth, which you can see from the amount of green area in this chart. In fact, Fil+ deals now account for over 70% of all active storage deals outstanding. But I'm a curious analyst; I don't want to stop there. I want to go to other parts of the protocol and build a multi-dimensional analysis. This is where I can check the storage provision section and see that during the past month or so we've also seen a spike in sector onboarding activity. So one event correlating with higher deal adoption is that more storage providers are either extending their commitments or joining the ecosystem to provide more service. Meanwhile, if I go into the gas and usage section and closely analyze the daily network fee breakdown, I'll also see that another thing correlating with higher deal adoption is that the network is getting busy again, judging from the fee activity, a direct signal of how busy the network is. Next. If our analyst is not satisfied with the overall statistics and trends and wants to do more careful KYC, they can go to the actor highlights and identify the top entities with the highest participation. Here on the screen we're seeing the top 10 clients ranked by verified deal bytes, and also the top providers actively taking Fil+ deals. Okay, next. If our analyst wants to drill further and really get to the bottom of this, he or she can check the individual client pages and identify protocol-specific behavior and patterns.
This is really about incentive design. What we're seeing on the screen is a three-stage KYC funnel, the Fil+ verification protocol, in which a notary approves clients to store verified deals in the system. We thought about how to capture the transaction patterns, and it turns out a graph-based method is a great way to concentrate the analysis: we build a local-level transaction graph for each client in the system. If our analyst wants to dig further, that's where he or she can look. All right, open-source insights. We're in Web3, so all of the insights are supposed to be open source, right? Remember I said that not all public information is properly diffused; now that we've done the middle-layer work in between, we can open-source these things for everybody participating in the ecosystem. We're currently testing this; we've already released a bunch and had very positive feedback, including from the people at Messari. Basically, we're open-sourcing insights in the form of data feeds, notebooks, and charts on the platform ObservableHQ. Everyone who wants to build their own analysis can treat this as a Dune Analytics-style API and just download the data for their own work. On to exciting opportunities. As I said, Filecoin is such a fantastic ecosystem; there are so many inspiring data science, engineering, and cryptography challenges. Some of the ones closer to my line of work: there's tons of visualization and modeling analysis, just as Tom presented, and tons of problems in intelligent sector reliability profiling and engineering.
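The graph-based drill-down described above, ranking clients by verified deal bytes and then inspecting each client's local transaction neighborhood, can be sketched with nothing more than an edge list. The deal records and addresses here are made up for illustration; real rows would come from on-chain deal proposals:

```python
from collections import defaultdict

# Hypothetical verified-deal records: (client, provider, bytes).
deals = [
    ("f1client-a", "f0100", 32 << 30),   # 32 GiB deal
    ("f1client-a", "f0200", 64 << 30),
    ("f1client-b", "f0100", 32 << 30),
]

def top_clients(deal_rows, n=10):
    """Rank clients by total verified deal bytes, largest first."""
    totals = defaultdict(int)
    for client, _provider, size in deal_rows:
        totals[client] += size
    return sorted(totals.items(), key=lambda kv: -kv[1])[:n]

def neighbors(deal_rows, client):
    """Local transaction graph: the providers a client has dealt with."""
    return sorted({p for c, p, _ in deal_rows if c == client})
```

The same edge-list idea extends to the notary-to-client allocation stage of the Fil+ funnel, giving a per-client view of where verified datacap came from and where it was spent.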
We have the data available; it's just a matter of how we build the best models on top of it. There are also questions related to computational game theory, incentive design evaluation, and evolutionary games. And as I just showed in a quick snapshot, we can do a lot of transaction graph analysis on this open public chain as well. Also, stay tuned for the Filecoin Virtual Machine (FVM), which I believe one of the next speakers will get into in more detail. And finally, we're hiring, so please visit our website and contact us; hopefully more people will join in collaboratively building this fantastic ecosystem. Thank you.

Wonderful, thanks for that, Edward. These really are very exciting opportunities for anybody who's interested in data science or data visualization; that last slide about exciting opportunities is very true. We have time for questions for the speaker. I love the dashboards that you've set up. Can you talk a little bit about how people are using them already? Are you seeing good responses? Are people engaging with it, building on top of it? I'm just curious what kind of interest you're seeing from the community in these kinds of visualizations and insights.

Yes, okay, a couple of examples. We've seen people inside the ecosystem, mainly the governance team; I know a few stakeholders on the governance side who check our dashboard on a daily basis. There have been a lot of stories where they first observed a couple of changes in charts they found interesting, and it ended up in an ecosystem-wide discussion on what's currently happening and what the top strategies are for the system to focus on.
One example: looking through our dashboard, one of our stakeholders realized there was a lot of sector retirement and expiration kicking in starting in January. After they saw that signal change on the dashboard, they responded very quickly and started building more initiatives around Fil+ deal adoption and the sector onboarding rate. I think that's a perfect illustration. We also have external adoption by people outside the ecosystem. For instance, when Messari wrote their first Filecoin analysis report, they used our ObservableHQ API to build their own analysis.

Any other questions? While you're thinking, I have one about consumption patterns of your data. You mentioned you have both current and historical data available for the Filecoin ecosystem. Do you have a sense of how long data stays fresh? Are people still making a lot of requests for data from near mainnet launch, or is it mostly closer in time?

That's actually a fantastic question. Right now it's been about a year and a half since mainnet launch, and the request we've seen from stakeholders is: keep everything historical. As the network develops, though, the data size and duration are going to become a challenge for us, so we've been discussing internally how our engineers can design a good UX around this. For instance, when you first arrive, you might see a three-month or six-month snapshot of the charts you'd like to see, with options on the UX side to show the full history if you want it. So far, given the short time span, everybody says: let's just keep it all historical.
Great, thanks again, Edward. Any other questions for the speaker before we move on?