Yeah, hi everyone. Everyone can hear me, right? Thanks for coming. Today I'm going to talk about scrypt and our proof that it is maximally memory-hard. The paper is joint work with Joël Alwen, Krzysztof Pietrzak, Leonid Reyzin, and Stefano Tessaro.

As motivation for this talk and work, let us start with the setting of password hashing. To be stored securely, passwords are typically hashed, often together with a public random salt. A good question is then how to design the hash function F to prevent attackers from recovering the passwords if they learn the hashes. To achieve this, we want the hash function F to be moderately hard. This means it should be fast enough not to slow down honest users while authenticating, but slow enough to seriously impact the feasibility of a brute-force attack. Traditionally, moderate hardness has been expressed in terms of time complexity, but this is by now understood not to be the best choice. In particular, time complexity can vary a lot across different platforms, and it is a poor choice when adversaries can use custom-made hardware like ASICs. For example, such hardware may exploit parallelism and pipelining to achieve massive speed-ups, and it has lower energy costs. Consequently, the adversary's average cost per hash evaluation can be much lower than that of honest users. Of course, I'm not defining precisely what cost is here, but ideally we would like to ensure that these costs are as close as possible.

To achieve this, many recent designs have exploited the observation that memory costs are largely platform-independent. This has led to a notion called a memory-hard function, or MHF for short. A memory-hard function F requires large memory to be evaluated in a reasonably fast way; with small memory, however, the evaluation takes much longer. More precisely, our goal is to maximize the cumulative memory complexity, or CMC for short, of any possibly parallel strategy to evaluate the function F. More specifically, let T be the time needed to evaluate the function F; the CMC is defined to be the sum of the memory usage over the time T.

Memory hardness was a de facto requirement in the recent Password Hashing Competition, and many practical designs of password-hashing schemes are indeed meant to be memory-hard, including the winners Argon2i and Argon2d. Ideally, we would like to prove that these functions are memory-hard, but finding such proofs is a difficult problem. So a good question is whether we can build provably memory-hard functions. The answer is yes, and several elegant MHFs were designed and proved to be memory-hard. All these MHFs belong to the category of iMHFs, whose memory-access patterns are input-independent. Recently, however, Alwen and Blocki raised two issues regarding iMHFs. First, no iMHF can achieve optimal memory hardness. And second, many practical iMHFs are vulnerable to parallel attacks, which makes them even less memory-hard. Facing these issues of iMHFs, we therefore ask whether data dependence can help. Perhaps surprisingly, the answer is positive, and our paper gives a rigorous proof showing that the data-dependent MHF scrypt is indeed optimally memory-hard. Scrypt is the very first conjectured MHF, proposed by Colin Percival in 2009. It is used as the proof-of-work hash within the cryptocurrency Litecoin. Moreover, it inspired the design of one of the Password Hashing Competition winners, Argon2d, and it is covered by an RFC standard.
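To make the CMC definition concrete, here is a minimal formalization as I would write it; the symbol σ_t for the memory state at step t is my own notation, not from the slides:

```latex
% If an evaluation strategy runs for T steps and \sigma_t denotes the
% memory state it holds at step t, its cumulative memory complexity is
\mathrm{CMC} \;=\; \sum_{t=1}^{T} |\sigma_t| ,
% i.e., the area under the memory-usage-over-time curve.
```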
The most important take-home message is that scrypt is the very first example of a function with provably optimal memory hardness, not to mention that it is practical, already in use, and relatively simple. We emphasize that finding such a proof has been a surprisingly hard problem, and previous proofs were either incorrect or restricted. About optimality, recall that no iMHF can possibly achieve optimal memory hardness, so the proof techniques for iMHFs are not enough here. Let us start the journey of exploring scrypt. First, I will introduce the scrypt function and the intuition behind its memory hardness. Next, I will prove the optimal memory hardness, and finally I will conclude the talk.

Let me first show you how scrypt works by introducing its core component, called ROMix. It is built upon a Salsa20-based hash function H with output length w. However, we ignore the implementation details of H and model it as a random oracle. On input x_0, ROMix first computes a chain of values x_1 to x_{n-1} iteratively by invoking the hash function H. Then it applies H again on x_{n-1} to get a value b_0. In the second phase, we first interpret b_0 as an integer and compute an index c_0 as b_0 mod n. This c_0 points to one of the x values. For example, suppose c_0 equals 2; then it points to x_2. The XOR of b_0 and x_2 then becomes the input to the hash function H. The output becomes the next value b_1, and this defines a new index c_1. Then again, if c_1 is, for example, n-1, we need to XOR x_{n-1} into the state in order to proceed in the evaluation. So on and so forth, until we finally get b_n, which is the output value. ROMix invokes the hash function H 2n times, where n is a tunable parameter. A practical parameter choice is, for example, n = 2^14 and w = 1 kilobyte.

Note that the interesting part of the evaluation of ROMix is what happens in the second phase. In particular, a useful way to think of this is to see the c_j's as unpredictable challenges that determine which values are needed to proceed in the computation. To answer each challenge c_j, we need to recover the value x_{c_j}. This will in turn allow us to learn the next challenge c_{j+1}. Finally, we need to answer all challenges to complete the evaluation. This process of answering challenges is exactly what makes ROMix memory-hard, and we want to abstract it a little further to understand why this is the case. More concretely, we consider a simplified setting where the challenges in the second phase, as opposed to being generated by H, are truly random and independent. This results in a game, call it the ROMix game, that captures the essence of why ROMix is memory-hard. The game consists of n rounds. In each round j, the challenger first generates and reveals a uniformly random challenge c_j, and the adversary is required to respond with x_{c_j}. If this answer is correct, the challenger initiates the next round, and the game terminates after all challenges have been successfully answered. The adversary's goal is to minimize its own CMC for answering all the challenges. Recall that CMC is the sum of memory usage over time.

Let me give you some intuition about some simple possible strategies the adversary may use in this game. The simplest, naive strategy answers all challenges immediately but requires large memory. Concretely, during the initialization phase, the strategy computes all x values and stores them in memory. Then, upon a challenge c_j, it can return the value x_{c_j} right away.
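To recap the ROMix evaluation just described, here is a minimal Python sketch. This is not the real scrypt code: SHA-256 stands in for the Salsa20-based hash H, and the helper name romix is mine.

```python
import hashlib


def romix(x0: bytes, n: int) -> bytes:
    """Minimal sketch of ROMix; SHA-256 stands in for the hash H."""
    h = lambda data: hashlib.sha256(data).digest()  # stand-in for H

    # Phase 1: compute the chain x_0, x_1, ..., x_{n-1} by iterating H.
    x = [x0]
    for _ in range(n - 1):
        x.append(h(x[-1]))

    # Phase 2: b_0 = H(x_{n-1}); then n rounds of data-dependent mixing.
    b = h(x[-1])
    for _ in range(n):
        c = int.from_bytes(b, "big") % n              # challenge c_j = b_j mod n
        b = h(bytes(s ^ t for s, t in zip(b, x[c])))  # b_{j+1} = H(b_j XOR x_{c_j})
    return b                                          # b_n is the output


# Example with toy parameters (H is invoked 2n times in total):
# out = romix(b"\x00" * 32, 2 ** 10)
```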
For this naive strategy, let n be the number of challenges and w the length in bits of each x value. The strategy needs Θ(n) time and Θ(nw) memory, so its CMC is Θ(n²w) — and just remember the value n²w, it is very important. Of course, we can also play the ROMix game with a small memory, but then it takes much longer to complete the game. In particular, this strategy only stores the input x_0. Upon each challenge c_j, it starts from x_0 and invokes the hash function c_j times to recompute the value x_{c_j}. The strategy now has constant memory size, but the expected time to answer each challenge is Θ(n), so the CMC is again Θ(n²w).

Clearly, the previous two strategies are only special cases which have constant memory usage over time. However, a strategy can be far more general, and in particular its memory consumption can vary a lot across the computation. Concretely, to save memory, the strategy could forget some of its information and recompute the values at a later point in time. For example, just before a challenge is revealed, the adversary may store most of the x values so that it can answer the challenge quickly, but after that it can reduce its memory consumption rapidly to minimize the CMC. Still, we would like to prove that no matter what the strategy does, the area under the memory-usage curve is always Ω(n²w). This gives us a first intuition that helps us prove memory hardness: namely, even for a general strategy, answering a challenge fast requires a large state. That is, to answer the challenge quickly, the initial state must be large, whereas if a strategy only keeps a small amount of data in memory initially, it must take longer to answer the challenge. To see why this is intuitively true, let's assume that the strategy stores p of the x values at the step right before the challenge is revealed. Given the challenge c_j, the number of steps needed to answer it is at least the distance from c_j back to the closest preceding index i for which x_i is stored. We observe that the average of this distance over the challenge is at least n/(2p), which implies that the expected time to answer the challenge is at least n/(2p), where p is the number of x values stored initially. This holds regardless of parallelism, as the computation of the x values is inherently sequential.

Unfortunately, to translate this intuition into a memory-hardness proof for ROMix, we face three technical barriers. First, the adversary might store arbitrary information, not just x values, to answer the challenge quickly with a small state. For example, it can store XORs of x_i values, or half of each x_i value. This is not a hypothetical concern: previous work at Eurocrypt '16 showed that in a related game, such general strategies can actually help, and the proofs of the previous work only consider restricted strategies. Second, the single-shot memory-time trade-off only gives us a memory lower bound at a single step, which is not enough to lower bound the CMC, since a general strategy can vary its memory consumption across the computation. In particular, given the trade-off, previous work only shows a suboptimal CMC lower bound. And last, in ROMix, the challenges c_j are oracle-dependent, as opposed to truly random and independent. I will focus on the first two issues in this talk; for the third issue, we refer to the paper. Next, we come to the part of proving optimal memory hardness.
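To back up the n/(2p) intuition, here is a small Python experiment (my own illustration, not from the paper): if only p of the x values are stored, the expected number of sequential H-calls needed to answer a uniformly random challenge comes out to about n/(2p).

```python
import random


def expected_answer_time(n, stored, trials=50_000):
    """Estimate the expected number of sequential H-calls to answer a
    random challenge c when only the x-values at indices in `stored`
    are kept: the cost is the distance from c back to the closest
    stored index i <= c (x_c is recomputed by iterating H from x_i)."""
    total = 0
    for _ in range(trials):
        c = random.randrange(n)
        total += min(c - i for i in stored if i <= c)
    return total / trials


# Storing p evenly spaced values (including index 0) gives an average
# distance of about n / (2p), matching the intuition from the talk.
n, p = 2 ** 14, 64
stored = set(range(0, n, n // p))
print(expected_answer_time(n, stored))  # ~ n / (2p) = 128
```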
To formalize the computation of ROMix by an adversarial strategy, we use the parallel random oracle model proposed by Alwen and Serbinenko at STOC '15. This model is very general and is used by previous works. The adversary first gets the initial input state x_0. At each step, the adversary asks one batch of parallel queries to the oracle H. After receiving the answers, it can perform unbounded computation and update its old state into a new memory state s_i. Finally, it produces an output, which is meant to be the output of ROMix on input x_0. The goal of the adversary is to minimize the CMC, which is the sum of the state sizes over time. Our main theorem shows that for any adversary evaluating ROMix, the CMC is at least around n²w with overwhelming probability; the polylog loss in the term is inherent in the proof. And note, Ω(n²w) is clearly the best possible CMC lower bound for any construction making n queries to the underlying hash function H, since a naive strategy can just evaluate it by making n oracle calls and remembering all oracle answers.

To prove the theorem, we first show a single-shot memory-time trade-off to obtain lower bounds on memory. In particular, given the challenge c_i, let t_i be the time needed to answer the i-th challenge. We show that at the step right before the challenge is revealed, the amount of memory used is inversely proportional to the time t_i; that is, the memory (green on the slide) times the time (orange) is at least Ω(nw). We have already seen such a trade-off in the special case where the adversary only stores x values in its memory. Here, we show that even if an adversary stores arbitrary information, the memory-time trade-off still holds. More concretely, assume an adversary is first given a state z, where z can be the result of arbitrary computation on oracle outputs — for example, precomputed oracle entries or XORs of x_i values. Next, a uniformly random challenge c is revealed, and the goal of the adversary A is to recover x_c; let t be the time A uses to recover x_c. Our goal is to lower bound the size of the state as a function of n and the time t.

We emphasize that the possibility of computing on H's outputs, which could lead to a small state and fast answering, is not just a hypothetical concern: previous work showed that in a related game, the adversary can indeed use non-trivial computation on H's outputs to keep a small memory state and still answer challenges quickly. Surprisingly, in our paper we show that this cannot help for scrypt. In particular, we prove a strong lemma showing that for every adversary A and most oracles H, if the state size is around p·w bits, then the time t is at least n/(2p) with probability at least one half. This is one of our main contributions; the proof is highly non-trivial, but the high-level idea is a proof by contradiction. First, suppose the time t is small for most c — that is, the adversary answers too fast for most challenges c. Then the adversary, on input z, must be able to output or query many x_i values without querying the oracle H first. Therefore, we can argue that this enables us to compress the oracle using the state z. Nevertheless, this cannot be true for too many H, as a random oracle is incompressible. We have thus shown the memory-time trade-off for arbitrary adversaries, which also resolves the first barrier, that an adversary may store arbitrary information.
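Paraphrasing the two statements just discussed in LaTeX — a schematic rendering of what I took from the talk; the exact constants and probability bounds are in the paper:

```latex
% Single-shot memory-time trade-off (schematic): for every adversary A
% and all but a vanishing fraction of oracles H, if the state z has
% size |z| \approx p \cdot w bits, then the time t to answer a
% uniformly random challenge c satisfies
\Pr_{c}\!\left[\, t \ge \tfrac{n}{2p} \,\right] \;\ge\; \tfrac{1}{2}.

% Main theorem (informal): any adversary evaluating ROMix in the
% parallel random oracle model has, with overwhelming probability,
\mathrm{CMC} \;=\; \Omega\!\left(n^{2} w\right).
% Summing the generalized per-step bounds over all n rounds (the
% "blue area" on the slides) is what yields this \Omega(n^2 w).
```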
Next, to solve the second issue — that the memory usage can vary a lot over time — we exploit and generalize the single-shot memory-time trade-off to prove the optimal CMC lower bound for the ROMix game. In particular, consider the execution of a general strategy, and let t_i be the time to answer the i-th challenge. The single-shot memory-time trade-off gives us a lower bound on memory only at the step right before the challenge is revealed. Concretely, it tells us that the memory used at this step is at least nw/(2t_i). However, this alone doesn't give us a CMC lower bound, because the memory usage can be very different at other steps, for which we don't have memory lower bounds yet. To address this, we generalize our proof of the memory-time trade-off to give lower bounds at arbitrary time steps. More specifically, consider some arbitrary point in time. We show that the memory size there is also inversely proportional to the time until the next challenge is answered. The memory usage at every time step can therefore be lower bounded as a function of n and of the time needed to answer the next challenge, so we now have a memory lower bound for every time step instead of only a single step per challenge. Using this, for each round i-1, we can lower bound the sum of the memory consumption over the round by nw/2 times a function of t_{i-1} and t_i. And by adding these lower bounds over all rounds from 0 to n-1, we get that the CMC of the ROMix game is lower bounded by the blue area on the slide, which is Ω(n²w). This finishes our proof.

Finally, let me conclude the talk. In summary, by showing that scrypt is maximally memory-hard, our paper gives the first optimal memory-hardness proof, which validates a practical MHF design. An open problem is whether our proof techniques can be extended to Argon2d, which has a similar design to ROMix. This is all I want to say, and thank you so much for listening. Thank you. Any questions?

In that case, I have one. Connecting your talk to the talk earlier in the day by Yevgeniy Dodis: what do you have to say about non-uniform adversaries who are allowed arbitrary precomputation?

They give compression arguments with respect to properties like collision resistance and one-wayness. We also have a compression argument, but with respect to arbitrary information: even if the adversary stores arbitrary information, it takes a long time to answer the challenges. That's the property we prove, and it really shares some ideas with those compression arguments.

Any other questions?

Thank you. I think from a practical security point of view, we are a bit disappointed, because we would like to see better what the assumptions in your work are in terms of the timing of memory. I mean, this month Intel is releasing a type of flash memory 1,000 times faster than what we had before. So this can happen overnight, and you have cache memory. So we don't really understand what it means to be memory-hard in terms of the timing of the memory. Could you explain this better?

Oh yeah. So first, if you only require a memory lower bound, then you have to make the inputs really large — say, the input length should be n — which is not interesting. And if you require even n-1 memory, then the time you need is quadratic, which is too long. So what we can hope for is only a memory-time trade-off. That's kind of disappointing, but it's the best we can do.
And also, about the caching and memory issues: yes, this kind of definition may not be the perfect one, and there is some new work on bandwidth-hard functions that tries to model that. We also want to work on that side. Yeah, thank you.