Okay, thank you for the introduction. As mentioned, I'm going to talk about proof of work in the context of cryptocurrencies. The term proof of work was first introduced by Dwork and Naor in the early 90s as a computational technique to limit users' excessive access to a shared resource. It's a very general concept, and in the context of cryptocurrencies, the entities that generate these proofs are called miners.

So how are the proofs generated? Each miner gets a list of recent transactions, binds those transactions into a block, and performs a computational task on it. Once the computation succeeds, the miner has a proof that it performed the work. The proof itself is just a string that the miner concatenates to the block before submitting the block to the network.

Who verifies the proofs? Each node on the network verifies the proof, and assuming the proof is correct, it adds the block to its local copy of the blockchain.

Why do we need proofs? Why do we need to work in order to add a block to the blockchain? The basic idea is that if it were easy to add new blocks, it would open the door to several attacks and manipulations of the blockchain, such as double spending, which is spending the same money more than once. The mechanism of proof of work is therefore an important layer in the safety of the blockchain.

We will now go over the most basic proof of work, the one used in Bitcoin. In Bitcoin's proof of work, there is a global adjustable difficulty parameter, denoted by D, that controls how hard it is to generate a proof. The prover's algorithm takes as input a challenge i. The challenge i can be deterministically computed from the previous block and the transactions in the current block. The goal of the prover is to find a nonce n such that executing SHA-256 on the concatenation of i and n gives a value less than the difficulty parameter D. Since SHA-256 is secure, the best algorithm is exhaustive search, and the proof is simply the nonce n.

Of course, there is also the verifier's algorithm, which is executed by each node on the network. What's the simplest algorithm you can think of? The verifier's algorithm takes as input a challenge i and a nonce n, and if the condition is met, it adds the block to its local copy of the blockchain.

Let's try to think about whether this is a good implementation of a proof of work. The major advantage of this proof of work is that the verification algorithm is very efficient. If the verification algorithm were slow, meaning it took a lot of resources, then since each node on the network executes it, you could mount denial-of-service attacks. But perhaps a more serious problem is that if the verification algorithm were slow, blocks would spread much more slowly over the network. This would create inconsistencies between the internal state of the blockchain at the different nodes on the network. Such a state is often referred to as a fork.

But there is also a disadvantage to this simplicity, and it comes from the perspective of the prover. The prover's algorithm can be implemented very efficiently on dedicated hardware, on ASICs, application-specific integrated circuits. In fact, the cost of computing the prover's algorithm is more than 10,000 times lower on ASICs compared to general-purpose CPUs. Well, this is a problem. Why? Because it is no longer worthwhile to mine on regular CPUs.
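To make the prover and verifier concrete, here is a minimal Python sketch of the scheme just described. The byte encodings and the threshold form of D are my own simplifications; real Bitcoin hashes a block header with double SHA-256 and encodes the target in a compact form.

```python
# Minimal sketch of Bitcoin-style proof of work (simplified encodings).
import hashlib

def prove(challenge: bytes, difficulty: int) -> int:
    """Exhaustive search for a nonce n with SHA-256(i || n) < D."""
    n = 0
    while True:
        digest = hashlib.sha256(challenge + n.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < difficulty:
            return n
        n += 1

def verify(challenge: bytes, n: int, difficulty: int) -> bool:
    """Verification costs a single hash: recompute and compare against D."""
    digest = hashlib.sha256(challenge + n.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < difficulty

# Example: a target requiring about 20 leading zero bits.
D = 1 << (256 - 20)
i = b"hash of previous block || current transactions"
nonce = prove(i, D)
assert verify(i, nonce, D)
```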
Today, mining is done in huge hangars containing thousands of dedicated hardware components that are specialized in computing SHA-256. It has become a centralized industry, which is contrary to the philosophy of Bitcoin.

So in this talk, I'm going to present an approach to fight this centralization using what is called a memory-hard function, which I will introduce shortly. Then I'm going to describe a concrete proof-of-work scheme called MTP that uses this approach. This will provide the background, and then I'm going to describe a subtle weakness in MTP that allows us to break its security claims, as we will see later on.

What is the purpose of a memory-hard function? We said there is a gap between the implementation of the prover's algorithm on dedicated hardware versus general-purpose CPUs. The purpose of a memory-hard function is to narrow this gap. The idea is that on the one hand, the standard execution of the algorithm requires a significant amount of memory, and on the other hand, any attempt to use less memory results in a high computational penalty. Why does this approach narrow the gap? At a high level, I will just say that since memory consumes a large amount of on-chip area, there is much less room left for computation cores.

Next, I'm going to describe a specific memory-hard function called Argon2d, which was presented by Biryukov and Khovratovich. Argon2d takes as input a challenge i, and, very schematically, it computes a long array of entries in sequential order: x1, x2, and so on. From now until the end of this talk, a block is simply an entry in this array. The output is the last block in the array.

So how is each block x_i computed? The first few blocks are computed directly from the challenge i, but almost all other blocks are computed using a compression function F. The compression function F takes as input two blocks from the array and returns a single output block. So x_i is the result of executing F on the previous block and another block located at index phi(i). Phi is called an indexing function, and in Argon2d it depends on the value of the previous block. For simplicity, an approximation of this function is to take the value of the previous block modulo i minus 2.

Let's take an example. Say we want to compute x7. We need the previous block, which is x6, and another block located at phi(7). Phi(7) is equal to the value of the previous block, x6, modulo 5, which in this case equals 3.

Why is Argon2d considered a memory-hard function? Assume we store in memory only a small fraction of the blocks, say one fifth, keeping every fifth block in memory. The goal is to compute x_i. Let's look at the computation graph of x_i: we need to evaluate it recursively, and if a block is not present in memory (these are the blue nodes in the computation graph), we need to recompute its value. If there are many blue nodes in the computation graph, we pay a high computational penalty. In the following table, you can see the best known time-memory trade-offs for Argon2d. The penalty increases very sharply, and therefore Argon2d is considered a memory-hard function.

We can now try to think about what would be needed in order to use Argon2d as a proof of work in the cryptocurrency setting. The problem is that verification would be inefficient.
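Here is a schematic Python sketch of the array filling just described. SHA-256 stands in for Argon2d's real compression function F, the array is 0-indexed, and the reduction x[i-1] mod (i-1) mirrors the "previous block modulo i minus 2" approximation from the talk.

```python
# Schematic Argon2d-style array filling (toy compression function).
import hashlib

def F(a: bytes, b: bytes) -> bytes:
    # Stand-in compression function: two blocks in, one block out.
    return hashlib.sha256(a + b).digest()

def fill_array(challenge: bytes, T: int) -> list[bytes]:
    # The first blocks are derived directly from the challenge i.
    x = [hashlib.sha256(challenge + bytes([j])).digest() for j in range(2)]
    for i in range(2, T):
        # Data-dependent indexing: phi(i) is the previous block's value
        # reduced modulo the number of eligible earlier blocks.
        phi = int.from_bytes(x[i - 1], "big") % (i - 1)
        x.append(F(x[i - 1], x[phi]))
    return x

x = fill_array(b"challenge i", T=1024)
output = x[-1]  # the output of the function is the last block
```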
The verifier would have to compute the entire array in order to check that the proof is correct. This leads me to the main subject of this talk, which is the MTP proof-of-work scheme. MTP was presented by the same designers as Argon2d, and they claim that MTP simultaneously offers both memory hardness of the prover's algorithm and efficient verification. Achieving each property separately is trivial, but getting them simultaneously is much more challenging. Also, MTP is a concrete proof-of-work scheme, which means the designers suggested concrete parameters for it, and it was originally planned to be incorporated into the Zcoin cryptocurrency.

One of the main components of the MTP proof-of-work scheme is the Merkle hash tree. A Merkle hash tree is a cryptographic structure that allows verification of small segments within a large memory in a very efficient way. We define two operations on such a tree. The first is building the tree: the build function takes as input a long array X and outputs the final hash of the Merkle hash tree, which is also called a commitment and is denoted by pi. The second operation is opening a block: the opening of block x_i is evidence that this block is located at index i. What is the idea behind this cryptographic structure? The idea is that given the commitment pi, it is computationally hard to lie about the content of x_i.

Now I'm going to describe the prover's algorithm in the MTP proof-of-work scheme. The prover starts by computing the Argon2d function and then building the Merkle hash tree to obtain a commitment pi. Then the prover chooses a random nonce n and starts computing a series of pseudo-random values Y0, Y1, and so on. From each Y, the prover derives a pseudo-random index and is required to provide the block located at that index. This process continues for L pseudo-random blocks, and as in Bitcoin, the final hash is compared against the difficulty parameter D. If it is not smaller than D, the prover chooses a new random nonce n and starts all over again; but if it succeeds, the prover has found a proof, and this proof includes the commitment pi, the nonce n, and the openings of 3L blocks.

Given the proof, the verifier can compute exactly the same chain of values in order to check that the proof is correct. For each pseudo-random index, the prover opens the block located at that index, but also the two blocks from which it was created according to the compression function of Argon2d. The idea is that once the prover has committed to these three blocks via the commitment pi, he cannot lie about their values.

As I said earlier, MTP is a concrete proof-of-work scheme with given parameters. The verification is very efficient, and the question is: what about the memory hardness of the prover's algorithm? MTP is built on three building blocks. The first is the memory hardness of Argon2d, the second is the Merkle hash tree, and the third is the series of pseudo-random indices, which is based on the Fiat-Shamir transform. Given that these building blocks are secure, does it follow that MTP is also secure? The goal of the attacker is to show that MTP is not memory hard by computing an MTP proof with a very small amount of memory but also a reasonable computational penalty. Once again, it should be emphasized that the computational penalty depends on the size of the Argon2d computation graph.
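The following Python sketch shows the shape of the prover's search loop as described above. The exact hash inputs and encodings in the MTP specification differ, and the Merkle tree openings are only indicated by a comment; this is a sketch, not the specified algorithm.

```python
# Schematic MTP prover loop (hash inputs and encodings are illustrative).
import hashlib

def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def mtp_prove(x: list[bytes], pi: bytes, L: int, difficulty: int):
    T = len(x)
    nonce = 0
    while True:
        y = H(pi, nonce.to_bytes(8, "big"))   # Y0 binds pi and the nonce n
        indices = []
        for _ in range(L):                    # L pseudo-random block queries
            j = int.from_bytes(y, "big") % T  # derive an index from Y
            indices.append(j)
            y = H(y, x[j])                    # next Y depends on that block
        if int.from_bytes(y, "big") < difficulty:
            # A real proof also carries Merkle openings of each queried
            # block and of its two Argon2d input blocks (3L openings).
            return nonce, indices
        nonce += 1
```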
And we saw earlier that if we keep only a small number of blocks in memory, we pay a high computational penalty.

What would happen if the prover tried to cheat? What would happen if the prover computed a different function from Argon2d? We focus on the most interesting prover, which cheats by including an epsilon fraction of inconsistent blocks. An inconsistent block is a block that was not computed according to the compression function of Argon2d. Now, what needs to happen in order to generate a valid proof? The prover needs all 70 pseudo-random indices to land on consistent blocks, on the blue blocks. If a pseudo-random index lands on an inconsistent block, the prover is forced to open the two predecessors of this block, and since the block is inconsistent, the verifier will catch it. A fairly simple probabilistic analysis shows that this succeeds with probability one minus epsilon to the power of 70 (see the worked example below), so a cheating prover who uses a large fraction of inconsistent blocks pays a very large penalty. The prover is therefore forced to use a very small fraction of inconsistent blocks.

What seems intuitive is that if the prover uses only a very small fraction of inconsistent blocks, he has computed a function that is similar to Argon2d; Argon2d is considered a memory-hard function, and therefore MTP is also memory hard. That's what the designers thought, but we are going to show that this is not the case. We are going to show a prover that computes an MTP proof with a very small amount of memory but also a relatively small penalty.

Recall the indexing function phi of Argon2d. The indexing function depends on the value of the previous block. A cheating prover is going to exploit the fact that Argon2d accesses its memory in a way determined by its previous computation, by the values of its own blocks. The goal of the cheater is to cheat in a very small number of blocks, such that the computation graph becomes much smaller. So the main idea of the attack is that a cheating prover will compute a different function from Argon2d, one with weaker time-memory trade-off resistance.

Let's see how this can be done. Assume we store in memory only a small set of blocks denoted by S; these are the green blocks in the array. Now we manipulate every second block such that the blue blocks are computed using blocks from the set S, meaning the indexing function of each blue block points to a block from the set S. Note that the red blocks are now inconsistent: they were not computed according to the compression function of Argon2d. The problem is that half of the blocks are now inconsistent, and the prover would pay a very large penalty in order to land on 70 consistent blocks. That's not good enough.

In fact, we can generalize this idea. We can divide the memory into larger segments of five or even 20 blocks. The inconsistent block in each segment, the red block, is also called a control block, and it is chosen such that the following t minus one blocks are computed using the set S, meaning the indexing function of each blue block points to a block from the set S. There are several problems here. First, computing control blocks is now relatively expensive, and we also have to store those blocks in memory. In fact, there are several optimizations we can apply.
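As an aside, here is the simple probabilistic analysis from above worked out in Python; the values of epsilon are my own illustrative choices.

```python
# Worked example: with an epsilon fraction of inconsistent blocks, all
# 70 pseudo-random indices land on consistent blocks with probability
# (1 - eps)**70, so the expected number of nonce retries is its inverse.
L = 70
for eps in (0.5, 0.2, 0.05, 0.01):
    p = (1 - eps) ** L
    print(f"eps={eps}: success probability {p:.3e}, expected retries {1/p:,.0f}")
# eps = 0.5 (every second block inconsistent) costs about 2**70 retries,
# while eps = 0.01 costs only about a factor of 2 -- which is why the
# cheater must keep the fraction of inconsistent blocks tiny.
```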
There is no need to keep the green blocks in memory; we can just keep the red blocks, the control blocks, and make sure that the blue blocks are computed directly from the red ones.

Let's see what we gain from this attack. Say we want to compute x15. We need the previous block, which is x14, and the block located at phi(15). We have manipulated the indexing function so that it points to a control block that is stored in memory, and so on, and so on. So we have managed to manipulate the computation graph in such a way that it is much smaller; in fact, it's a single branch.

Let's go through the phases of this basic attack. We start by computing the control blocks and building the Merkle hash tree to obtain a commitment pi. Then we need to generate a consistent proof. For that, we need all 70 pseudo-random indices to land on the blue blocks, on the consistent blocks. If a pseudo-random index lands on an inconsistent block like this one, the prover is forced to open the two predecessors of this block, and since the block is inconsistent, the verifier will catch him.

Let's concentrate on the first phase of this attack. What is the computational complexity of generating the control blocks? We denote by capital T the array size and by lowercase t the segment size. Since every t consecutive blocks contain a single control block, the fraction of control blocks is one over t. Each control block controls the next t minus one blocks, so we need to satisfy t minus one conditions, each holding with probability of roughly one over t, and therefore generating a single control block takes roughly t to the power of t minus one trials. Note that the complexity is exponential in t, and the problem is that for a large t, we will spend a significant amount of time generating those control blocks. A toy sketch of this search appears at the end of this part.

One of the techniques we used to speed up the online computation, which begins once the challenge i is received, is preprocessing. At first sight, it may not be clear how a cheating prover can benefit from preprocessing, as the function has to be applied to the challenge i, whose value cannot be predicted in advance. The idea is to partition the array into two parts. The first is a small prefix that is computed honestly online using the challenge i. The second is independent of the first and contains most of the blocks. In order to keep the complexity of the preprocessing phase reasonable, we store a few fixed-value blocks in addition to the control blocks, and we make sure that the blue blocks are computed using these few fixed-value blocks and do not depend on the undetermined prefix. So during the online computation, the only thing the prover needs to do is compute the small prefix and finish the Merkle hash tree. We stress that the preprocessing phase can be very easily parallelized and requires less than one megabyte of memory. We also note that additional optimizations are possible and are described in detail in the paper.

Now let's evaluate this attack. In a standard metric, we can show that we can compute a proof roughly 110 times more efficiently than the honest prover, of course only after a practical pre-computation. The potential effect of this attack is that an attacker can now overwhelm the network with malicious proofs using very little computational power. This may cause significant problems such as double spending, and in this sense the attack is critical.
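To make the control-block search concrete, here is a toy Python sketch under my own simplifications: SHA-256 replaces Argon2d's compression function, the indexing function is reduced to a plain modulus, and S maps stored indices to block values.

```python
# Toy sketch of the control-block search: try candidate values until the
# indexing function of each of the next t-1 blocks points into the stored
# set S. With S of density about 1/t, each condition holds with
# probability about 1/t, so the search costs roughly t**(t-1) trials.
import os
import hashlib

def F(a: bytes, b: bytes) -> bytes:
    return hashlib.sha256(a + b).digest()

def find_control_block(S: dict[int, bytes], t: int, T: int) -> bytes:
    while True:
        candidate = os.urandom(32)   # inconsistent: not computed with F
        prev, ok = candidate, True
        for _ in range(t - 1):
            phi = int.from_bytes(prev, "big") % T  # simplified indexing
            if phi not in S:         # must reference a block we stored
                ok = False
                break
            prev = F(prev, S[phi])   # next blue block on the single branch
        if ok:
            return candidate
```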
In conclusion, MTP is a concrete proof-of-work scheme with several cryptographic building blocks. Even though each building block is secure, there is a subtle weakness in its indexing function, in its phi, that leads to a very efficient attack. The most natural way to fix MTP is to change the indexing function so that it does not depend on the values of X; however, there are still other weaknesses. Essentially, a significant impact of this attack is that MTP was withheld from deployment in the Zcoin cryptocurrency. So designing a proof of work with desired properties such as efficient verification while also fighting centralization is still considered a difficult challenge, and I expect that in the near future there will be further research in this area. Thank you for your attention.