Thank you for the introduction. This talk is on tight proofs of space, and I'll explain what those are.

First, a proof of space is an alternative to proof of work where the resource being used is space instead of computational energy. It should require the prover who is producing the proof to use significant space, and it has all the same applications as proof of work, including spam prevention, mitigating denial-of-service attacks, and, most famously, Sybil resistance in consensus networks like Bitcoin. The advantage of a proof of space is that space is a reusable resource: producing a new proof of space can reuse the same space, unlike a proof of work, which requires consuming new energy. So it's proposed as a more eco-friendly way of getting a system like Bitcoin.

What does a proof of space look like? It's an interactive protocol between a prover and a verifier, where the prover is allowed to choose the amount of space it commits to store. It sends a commitment to the verifier, the verifier sends a challenge, and the prover produces a proof that should convince the verifier that it is using almost all the storage it claims to be using. As efficiency requirements, we require the proof and the commitment to be much smaller than the storage the prover claims to have, so the overall communication should be much smaller than the storage, and the verifier should also run efficiently: if N is the amount of storage the prover claims to be using, the verifier should run in polylog(N) time.

There's also a non-interactive version: since by definition any proof of space protocol is public coin, the Fiat-Shamir heuristic can be applied to get a non-interactive version where the challenge is derived as a hash of the prover's commitment.

In the literature on proofs of space there's a differentiation between two types. The one I just described may only convince the verifier that the prover used a lot of space in order to produce the proof; it may not require the prover to persistently use space over time. A proof of persistent space is one where the prover demonstrates that it continuously uses space over time. This is divided into two phases: an initialization protocol, which looks much like the one I just described and can be either interactive or non-interactive, and then an online phase, where the prover fields challenges from the verifier and computes responses that should convince the verifier that it is still storing a lot of data. The online phase is treated as a separate phase, rather than having the prover continuously rerun the initialization, because it can be made much more efficient for the prover; the initialization phase could require the prover to do a lot of work.

Notice something regarding the efficiency of initialization versus the online phase. If the prover could simply rerun the initialization to respond to the online challenges, this clearly wouldn't work as a proof of persistent space: the prover could delete its space after each challenge and redo the initialization to respond to the next one. So the way we rule this out is by restricting the time the prover has to respond to challenges; in particular, the time the prover has to respond should be much shorter than the time it takes to run the initialization.
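To make the non-interactive version mentioned above a little more concrete, here is a minimal Python sketch of deriving challenges as a hash of the prover's commitment. This is my own illustration rather than the paper's protocol; the use of SHA-256, the function names, and the parameters are assumptions.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    """Hash function standing in for the random oracle used throughout."""
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()

def fiat_shamir_challenges(commitment: bytes, n: int, k: int) -> list[int]:
    """Derive k challenge indices in [0, n) from the prover's commitment.

    This replaces the verifier's random challenge in the non-interactive
    version: the prover cannot choose its storage after seeing the challenge,
    since the challenge is itself a hash of the commitment.
    """
    challenges = []
    counter = 0
    while len(challenges) < k:
        digest = H(commitment, counter.to_bytes(8, "big"))
        challenges.append(int.from_bytes(digest, "big") % n)
        counter += 1
    return challenges

# Example: 30 challenges over N = 2**20 committed blocks.
chals = fiat_shamir_challenges(H(b"prover commitment"), 2**20, 30)
```

In the interactive version the verifier would sample these indices itself; Fiat-Shamir simply replaces that sampling with a hash of the commitment, which is why the public-coin structure of the protocol matters.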
So how tight is a proof of space? That's the principal question this paper explores. In other words, how much space can an adversarial prover save and still pass the protocol? Could it pass the protocol with only 1 − ε gigabytes if it claims to be using a gigabyte? For a tight proof of space the answer should be no, for 1 − ε as close to one as possible; in other words, a smaller ε gives you a tighter proof of space.

So we can define ε-tightness as saying that if the online prover stores less than 1 − ε gigabytes, then it should fail, with overwhelming probability, to respond within some time limit t, where t is typically set proportional to the storage size. A weaker way of defining proof of space security would allow for a time-space trade-off. In the definition here, even a parallel prover, one that uses a lot of parallelism to get its response time below t, should fail; under a time-space trade-off definition the prover would simply have to do a lot of computational work if it chooses not to use that much space persistently.

In this work I define a tight proof of space as one that can be tuned to be arbitrarily tight: the protocol parameters can be tuned, for any ε less than one, so that the protocol is ε-tight as just defined. The problem is that tuning the parameters to get a tighter proof of space, for smaller ε, may also hurt efficiency. We'd like to maintain efficiency, so the efficiency goal is to keep the proof size (the communication) and the increase in computation proportional to 1/ε as ε gets smaller.

Intuitively, the reason 1/ε seems to be the best we can do is the way the information-theoretic versions of the online challenges work: the prover has to store some string of large size, and if it forgets too much of that string, the verifier's challenges will catch it on the parts of the space it forgot. If it only forgot one percent of the space, then the probability that the verifier's challenges land in that part of the space is only good enough if the number of challenges is on the order of 1/ε, or λ/ε for a security parameter λ. In this work we get ε-tightness with a proof size that is seemingly close to optimal, on the order of log N / ε.
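The λ/ε intuition is just a concentration bound: a prover missing an ε fraction of its storage survives k independent random challenges with probability (1 − ε)^k, so k on the order of λ/ε pushes that below 2^(−λ). Here is a back-of-the-envelope sketch in Python; the parameter names and the 40-bit security target are my choices, not the paper's.

```python
import math

def challenges_needed(eps: float, security_bits: int = 40) -> int:
    """Smallest k with (1 - eps)^k <= 2^-security_bits,
    using (1 - eps)^k <= exp(-eps * k)."""
    return math.ceil(security_bits * math.log(2) / eps)

for eps in (0.5, 0.1, 0.01):
    k = challenges_needed(eps)
    print(f"eps={eps}: k={k} challenges, cheat prob ~ {(1 - eps) ** k:.1e}")
```

For ε = 1/100 this is already a few thousand challenges, which is why it matters so much whether the proof size scales like 1/ε or like 1/ε².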
So what was the state of the art before this work? Proofs of space were first introduced in 2015, and that construction only guaranteed that the prover stores more than roughly a 1/512 fraction of the claimed space, meaning there could be an adversary who uses only a 1/512 fraction of the data. Note that these numbers don't indicate that there is actually an attack achieving that, just gaps in the analysis. Ren and Devadas improved this in 2016 and brought the guaranteed fraction closer to one half. A construction from 2017 was a different type of proof of space, but it wasn't tight. Then in 2018 we had the first tight proof of space, although unfortunately in that construction the proof size increases proportionally to log N / ε², which makes a big difference for efficiency: say ε is 1/100, then that is a factor of 100 more than log N / ε. Furthermore, that construction used very special types of depth robust graphs for which we don't have practical instantiations. We have asymptotic constructions, but even heuristically we haven't really been able to demonstrate that a concrete graph has the required property.

So why do tight proofs of space matter? For one, you get better provable security. As in any tight security exercise, we would like the gap between what the honest prover has to do and what the best possible adversary can get away with to be small. Otherwise, either we tune the parameters for the worst possible adversary and make life extremely hard for the honest provers, or we make life reasonable for the honest provers and accept that there is some adversary that virtually doesn't have to use any space at all.

A second motivation is that tightness is necessary for a new type of primitive called proofs of replication. I won't go into proofs of replication in this talk, since they're extraordinarily tricky to define properly and it would take a lot of time to explain. But I will talk about useful proofs of space, which are closely related to proofs of replication and also greatly benefit from tight proofs of space, so I'll explain that next.

So what is a useful proof of space? It's a proof of space with an additional correctness requirement: the prover is able to use its storage to store files of its own interest, so it doesn't have to waste the space while engaging in the proof of space protocol. I can still use my space to store my movies.

What is an application? Think about using proofs of space to build blockchains, as a Bitcoin alternative. One of the things we're concerned about with blockchains is the so-called blockchain carbon footprint: proof-of-work-based Bitcoin wastes a lot of energy, since all the miners in the system continuously burn energy to maintain it. Proofs of space, we said, are more eco-friendly because the same space can be reused, so maintaining the system doesn't keep consuming new resources every minute. But it still doesn't do anything positive; you could say the space is still not being used for anything useful. A useful proof of space would push this even further and have, so to say, a positive footprint, because the work the miners do to maintain the Bitcoin-like system can simultaneously be used for data storage.

So consider a system where the miners are all mining for the blockchain and using their space for useful data storage. One thing we'd be concerned about is that at some point one of the miners finds a cheat that lets it gain a huge advantage in the proof of space protocol. Unfortunately, that cheat deviates from the honest protocol, so it doesn't necessarily have the same correctness property; perhaps it no longer allows the prover to store useful data. Well, news travels fast, all the other miners catch on, and eventually this falls back to being just a proof of space blockchain that is no longer doing useful data storage.
So one implication of a tight proof of space is that we can rest assured that nobody will find an adversarial strategy that saves significantly by deviating from the honest protocol.

Another thing we would be concerned about is how efficient it is to extract data from this useful proof of space. Say somebody wants to retrieve the movie the miner is storing; if that takes a really long time, it may be undesirable for the application. You can think of it as being a bit less useful, but perhaps still useful for storing archival backups.

Unfortunately, a caveat of data extraction efficiency is that efficient extraction implies the proofs of space are not asynchronously composable. Let me illustrate why. Say a prover encodes a data file f in its proof of space and stores this encoding of f as its persistent storage. After that, it initializes another proof of space where the data input, which it can choose, is exactly the storage it needs in order to pass the first protocol. Now it only needs to store the second encoding, the encoding of the encoding of f, and by the efficiency of extraction it can efficiently extract f and the first encoding during the online challenges, so it can respond to the challenges for both proofs, pretending to use twice the amount of space while really only using half that amount. That would be a problem, but we're going to ignore it here; there are perhaps ways of dealing with it in applications where there's a stronger promise about which files are being stored by the system.

Let me talk, in the remaining time, about the construction. The construction of many proofs of space works by labeling a directed acyclic graph. So what is a graph labeling? We use a collision-resistant hash function H and label each node of the DAG. Each node has a data input, which in this case we can just take to be the index of that node. A source node is given a label which is simply the hash of its index, and every other node is given a label derived as the hash of its index together with all its dependency labels, that is, the labels on the nodes that have an edge to it, and so on until we have labeled the whole graph.

The way this family of proof of space protocols works is by sending a commitment to the graph labels, receiving a challenge, which is a subset of the nodes, and then opening the labels of the nodes in the challenge set together with their parents; the verifier checks that the hash relationship is correct, at least on the challenged indices. It can be shown that for special types of directed acyclic graphs this gives a proof of space, because if the prover forgets too many of the labels, it will have to do sequential work (we just heard two talks on proofs of sequential work) in order to re-derive those labels and respond correctly to the challenge.
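As an illustration of the labeling just described, here is a minimal Python sketch. It is a toy rendering, not the paper's construction: the hash choice, the byte encoding of indices, and the assumption that nodes are numbered in topological order are mine, and a real protocol would additionally commit to the labels (for example with a Merkle tree) so that challenged nodes and their parents can be opened against the commitment.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()

def label_dag(n: int, parents: dict[int, list[int]]) -> list[bytes]:
    """Label nodes 0..n-1 of a DAG given in topological order.

    A source gets H(index); every other node gets H(index, parent labels),
    i.e. the labels of all nodes with an edge into it.
    """
    labels: list[bytes] = []
    for v in range(n):
        parent_labels = [labels[u] for u in parents.get(v, [])]
        labels.append(H(v.to_bytes(8, "big"), *parent_labels))
    return labels

# Toy example: a path 0 -> 1 -> 2 -> 3.
labels = label_dag(4, {1: [0], 2: [1], 3: [2]})
```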
So let me give you a very high-level overview of the construction I give in this paper. Don't expect to understand all the details here; I just want to give you a road map so you have a picture in your mind of what the next few slides will be about.

The first step is to build a very weak, not tight, proof of space from something called a depth robust graph. The depth robust graph we require is simply one where any very large subset of the nodes, say any 80 percent subset, contains a long directed path. Intuitively this gives you a not-tight proof of space, because if a prover forgets the labels on 80 percent of the graph, that forgotten subset contains a long directed path, and re-deriving the labels along that path requires sequential work.

The next step is to amplify this into a tighter proof of space by layering depth robust graphs and adding bipartite expander edges between the layers. We can show that this gives a tight proof of space, but not one from which you can extract data efficiently. So the last step is to make the picture prettier; I call the technique localization. We basically absorb the edges of the bipartite expander graph, projecting them into the layers, and we have to reverse the edges of the depth robust graph at every layer, for reasons I'll get into later: it's needed to maintain the proof of space security. What we end up with is a graph structure where each layer can basically encode the labels of the previous layer, and extracting data from it can be done much more efficiently.

So first, the depth robust graph. Remember, a depth robust graph can be defined in many ways. A very strong depth robust graph is one where, even if you delete any constant-fraction subset of the nodes, the remaining graph maintains depth; but we can get away with the weakest possible notion, which only requires robustness on very large subsets. As I explained before, if you just apply the labeling game to such a graph, it intuitively gives you a weak proof of space.

Now, what is a bipartite expander graph? There are two sets of nodes, sources A and sinks B, and we say the graph is an (alpha, beta) expander if any subset of A of size alpha·|A| is connected to at least a beta fraction of B; it is said to have beta over alpha expansion.

So the first construction, step two, takes copies of a depth robust graph; in the picture, the edges of the depth robust graph are marked in red. Every layer has a copy of the depth robust graph, and then we add the edges of bipartite expander graphs between the layers. The labels are derived layer by layer, and the labels on the last layer are stored.

Let me give you some intuition about why this gives a tight proof of space. Consider a naive attack that stores the labels on the last level but forgets some of them, say the labels marked in red. If there were only one layer, re-deriving just those labels would not require sequential work. But because of the bipartite expander edges, if we look at the dependencies of these labels on labels in previous levels, labels that are not being stored and would need to be re-derived, the dependencies expand as you move up the graph, until they include 80 percent of some level. And since the graph is depth robust on 80 percent subsets, re-deriving 80 percent of any given level requires sequential work.
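Here is a rough Python sketch of how the layered labeling might look. The graphs below are trivial placeholders, not actual depth robust or expander graphs, and the exact way dependencies are fed into the hash is my assumption; the point is only to show labels flowing layer by layer, with same-layer edges and cross-layer expander edges both feeding each hash, and only the last layer being kept.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()

def layered_labels(n, num_layers, drg_parents, expander_parents):
    """drg_parents(v): same-layer parents of v (the depth robust graph).
    expander_parents(v): previous-layer nodes with an expander edge to v.
    Only the labels of the last layer are returned (and would be stored)."""
    prev = None
    for layer in range(num_layers):
        cur = []
        for v in range(n):
            deps = [cur[u] for u in drg_parents(v)]             # same-layer edges
            if prev is not None:
                deps += [prev[u] for u in expander_parents(v)]  # cross-layer edges
            cur.append(H(layer.to_bytes(4, "big"), v.to_bytes(8, "big"), *deps))
        prev = cur
    return prev

# Placeholder graphs (NOT depth robust / expanders), just to make it run:
def drg_parents(v):
    return [v - 1] if v > 0 else []

def expander_parents(v):
    return [v, (v * 7 + 3) % 8]

last_layer = layered_labels(8, 4, drg_parents, expander_parents)
```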
You can use this construction to encode data simply by taking the last layer and XOR-ing it with the data file of interest. I'm omitting some details on how you would modify the proof, but the main point is that, unfortunately, this generic way of using the proof of space to encode data requires you to re-derive the labels deterministically, so extracting the data is as inefficient as generating or initializing the proof of space in the first place.

So the last step is modifying the structure of this graph. What we do is project the edges of the bipartite expander into each level. This has the effect of turning each level into an expander as an undirected graph; each level is still a DAG, so it is not literally an expander graph, but viewed as an undirected graph it is an expander.

Before I get to the security intuition, the reason this allows you to encode data in a more efficiently extractable way is that the dashed edges are now used to derive a key which encodes the same-index label on the previous level. If you look at, say, c6, its label is derived by hashing the dependencies of c6 in the same level and using the result as a key to encode c1, let's say simply using XOR. So the labels on the last level are actually encodings of the labels on the first level, and those first-level labels can simply be data inputs.

Let me give you the intuition for why this still maintains proof of space security, and why we need to reverse the edges in each level. Again, each level is an expander as an undirected graph, so if you look at the set of targets and dependencies of any given node, that set is large. If we go up one level, the targets of this node, c14, become dependencies of c9, which is another label we need in order to re-derive the encoding of c14, and the dependencies of c14 become targets, which in turn become dependencies in the next level. So if you go two levels up, the dependencies expand, just as they did with the bipartite expander graphs, and with roughly double the number of levels you get the same expansion effect. This is not a proof; the analysis has to go through a much more careful case analysis of everything a prover could do, but in the end we only need a number of levels proportional to log(1/ε), where ε is the tightness we want. And the extraction is parallelizable, because once you have all the labels on one level, you can re-derive the labels on the previous level in parallel, and so on until you get back the data.
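To illustrate the localized encoding, here is a toy Python sketch of a single level. It is a simplification under my own assumptions: the within-level graph is a placeholder rather than a real localized expander, the reversal of edges on alternating levels is ignored, and the key derivation is invented for illustration. What it does show is why extraction is cheap: every key is a hash of labels that are all stored, so the previous level can be recovered node by node, in parallel.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode_level(prev, parents):
    """Node v's label is prev[v] XOR a key hashed from v's dependencies
    *within the same level* (nodes assumed in topological order)."""
    cur = [b""] * len(prev)
    for v in range(len(prev)):
        key = H(v.to_bytes(8, "big"), *[cur[u] for u in parents(v)])
        cur[v] = xor(key, prev[v])
    return cur

def decode_level(cur, parents):
    """Extraction: all same-level labels are stored, so every key can be
    recomputed independently and prev[v] = cur[v] XOR key."""
    return [
        xor(H(v.to_bytes(8, "big"), *[cur[u] for u in parents(v)]), cur[v])
        for v in range(len(cur))
    ]

def parents(v):  # placeholder within-level DAG, not a real expander
    return [v - 1] if v > 0 else []

data = [bytes([i]) * 32 for i in range(8)]   # first level = the data inputs
level1 = encode_level(data, parents)
assert decode_level(level1, parents) == data
```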
So thank you, that is the end of this talk. Are there any questions?

I have a philosophical question about these proofs of space, especially the reusability of the space and the fact that you have useful proofs that allow you to continue to store other data, like your movies. Say I'm trying to prove that I have 2^40 of memory. I actually don't have it, but Amazon has it. Amazon can continue to store in its databases all the movies and other data it is storing anyway, and it will rent me the memory for one second very cheaply, because it doesn't cost it anything extra, so people will be able to pretend they have the 2^40 of memory even though they are just renting it. Did you think about the economic issues of proofs of space, especially when they are useful?

Yes, I have thought about that, and other people have pointed it out as well; in general any useful variant, useful proofs of work included, would have a similar issue. The effect is more that Amazon would then dominate the mining of the system. The impact is that you lose the property you have in Bitcoin, where the miners are economically committed to Bitcoin as a network because they're invested in it: they need their mining hardware, and it's not useful for anything else. Amazon doesn't really care whether the system survives; it just continues to store files. On the other hand, these constructions would be fairly inefficient for Amazon to run, so maybe we don't want to make them too efficient. But it's an excellent point, and there's a philosophical debate about it. Thank you. There was another question here.

Also related to that property of reusing the proof of space: sometimes you would like to prove that you have new space, so you might not want this reusability. Say I have some proof, and then I'm doing a proof for someone else, so maybe I don't want to reuse the same proof.

Reusability doesn't contradict that goal. The point of a proof of persistent space is that you commit to storing, say, a gigabyte, and you can continuously prove that you're still using that gigabyte. If you then want to use more space, you can bump that up to two gigabytes by producing an independent proof of space with a different protocol identifier, and if those compose, you're showing that you have twice the amount of space. The point is that you can reuse the same resources to keep producing proofs that you're still storing the committed space.

It looks like in your construction you have a lot more dependencies in the graph. The graph is also bigger, roughly twice as big, and the number of edges is bigger, so maybe that poses some constraints on the prover?
Yes. The graph I use is considerably bigger than the data, but the prover only stores the labels on the last level, which is the size of the data; it just has to go through several steps of derivation to get there. In fact, one advantage of the second construction over the first that I didn't mention is that, because the first construction doesn't have this locality property, the prover would naively need a buffer of size twice the data in order to derive its data storage, since it has to keep the labels of the previous level around while deriving the next level. In the last construction, once the prover has the labels on a given level, it can replace them one by one as it derives the labels on the next level, so overall it only needs N storage to derive the last level, rather than a buffer of size 2N.

Are there any other questions?

Thank you for your talk. I had a question regarding the interactiveness of the proof: in your pictures, at least, the protocols were all interactive. Can they be made non-interactive for the persistent proof of space?

That's an interesting question. Classically, no. However, if you had an ideal realization of a random beacon, one that spits out unpredictable random numbers at specific time intervals, that could be used to replace the random challenges that come at intervals from the verifier. But that requires realizing a random beacon; there are proposals for that, but they require much stronger assumptions.