All right, so thank you. I'm going to talk today about scrypt and its memory hardness, and this talk can really be seen as a continuation of Jeremiah's talk, in the sense that I'm going to tell you the other half of the story about memory-hard functions, and in particular talk about data-dependent memory-hard functions, of which scrypt is one example. I should mention this is joint work with a set of amazing co-authors: Joël Alwen, my PhD student Binyi Chen, Krzysztof Pietrzak, and Leo Reyzin. Leo and Joël are also in the audience, so you can certainly ask them questions about it too.

I don't need to introduce the concept of a memory-hard function anymore, since Jeremiah has already introduced the notion. (Let me just do one thing here, I have a problem with the screen.) All right. For those of you who missed the talk or just came in, just think of these as moderately hard hash functions to be used in the context of password hashing, key derivation, or proofs of work, where moderate hardness is in terms of both time and, most importantly, memory consumption. There have been numerous practical designs targeting memory hardness, as was just mentioned: scrypt, Argon2d, Argon2i, Catena, Balloon hashing, and many more. But the focus of this talk is once again going to be on provable security.

Provable security has been very effective in validating real-world cryptography, because it helps us gain extra confidence, in particular against attacks that we cannot envision yet. But in the context of memory-hard functions and the Password Hashing Competition, where many such designs were proposed, things went a little differently, mostly because we were lacking the appropriate theory at that point in time, or at least that's my assumption. Validation there has mostly occurred in terms of cryptanalysis and, often enough, intuition. As you have just seen, there is now better and better theory to validate memory-hard functions, but we are still somewhat on a quest to find practical designs that are also provably memory hard in the strongest possible sense.

If you want to close this gap, there are two things you might try to do. One of them is what was advocated in the previous talk: we have a well-defined theoretical framework to design provably secure memory-hard functions, and we want to find those that are practical enough within this framework and can be deployed. The alternative, which is what we do in this work, is to look around for designs that already exist in practice but are out of reach for current techniques, yet resist attacks, and try to come up with theory that proves them secure and memory hard. That's exactly what we did in this work, and we targeted scrypt.

The main message here is positive: we provide a proof that the scrypt function is optimally memory hard, or ideally memory hard, using the language from the previous talk. This is particularly surprising at first, because scrypt was the very first candidate memory-hard function; it was introduced by Colin Percival in 2009, in the very same paper that introduced the notion of memory-hard functions as we know it now. And it has found usage.
It is the object of a recently published RFC. It is also used within cryptocurrencies for proofs of work, in Litecoin and other lesser cryptocurrencies. And it inspires other designs: you can find ideas from scrypt, for example, in Argon2d, one of the two variants of Argon2, the winner of the Password Hashing Competition. Let me also stress that this is interesting in practice, but also from a theoretical standpoint: this is the first example of a memory-hard function that we can prove optimally memory hard, so even in theory, not just in practice.

Let me put this a little bit in the context of the previous talk; a lot has been said already. The main point here is that we distinguish between memory-hard functions that are data dependent and data independent. Data-independent ones are those where the memory access patterns during the execution cannot depend on the input, whereas for data-dependent ones they can. scrypt, like Argon2d, is an example of a data-dependent one. So far the theory has really been focusing on data-independent designs. Sure enough, one reason is that they have some attractive features that I'm going to talk a little more about later, namely resistance to side channels, if you're concerned about that. But a major point is also that, from a theory standpoint, there is a much cleaner connection to graph pebbling, as Jeremiah has just illustrated, and that makes proving things much easier. On the other hand, even though we have this framework in place, coming up with practical designs is, as you can see, work in progress, and there is some controversy; but we now have attacks that definitely show there are structural weaknesses in the practical designs we already have. On the data-dependent side of things there is quite a lot we don't understand: we have designs around, like scrypt and Argon2d, that seem to resist any non-trivial attack or speed-up, and on the other hand we have absolutely no theory to understand why that's the case, and they seem much harder to tackle. That's exactly what we were concerned with here.

A little more concretely, when I say we analyze scrypt, we actually target an analysis of what is called ROMix, which is the core of the scrypt function. It is a mode of operation for a simpler hash function; that inner function is not quite a hash function in the usual sense, it's built out of Salsa20 and it's length-preserving, but I'm going to abstract it as just a hash function H. The structure is very simple. It starts with a single application of the hash function H: you have an input m, which becomes your first state x_0. Then you iterate the hash function to get n values x_0 up to x_{n-1}, and then you apply the hash function once more to get the initial state of the second stage. The second stage is where the interesting things happen that ensure memory hardness. In particular, we start by looking at the initial state s_0.
Maybe you can't really see it from far away, it's a little bit small, but what you do is you now want to derive an integer between 0 and n-1 from the state s_0. You can do that, for example, by interpreting s_0 as an integer and then reducing it mod n. You interpret this as an index which points to one of the n values x_0 to x_{n-1}, say it points to x_1. You take that value, XOR it into the state, apply the hash function again, and you get the next state. And you can go on like this: you do the same thing with the next state, derive a new index (say in this case, by accident, it's the last one), take that value, XOR it into the state, and go on and on until you get the final output.

Now, you have some parameter n here, and I don't think there is a clear consensus on what the right parameters for scrypt are. Suggestions I have seen are things like n = 2^14, and w, which is the length of the hashes and hence the size of the state, should be something like one kilobyte or larger; but by today's standards you might even want larger parameters.

All right, so we want to analyze this, we want to prove it memory hard. What that means is that we would like to prove a lower bound on the cumulative memory complexity, which was already introduced in the previous talk. Informally, we want to show that for every adversary, even parallel ones, if we take the integral of this curve, so we look at the memory consumption at every point in time and sum it up, the result is large enough. And this is a lower bound for another metric, which is less appropriate since it is less resilient to amortization, namely the ST complexity, which just looks at the maximum memory consumption times the time complexity.

We want to prove this, and somehow we are usually very bad at proving lower bounds in crypto. So what we need to do in these cases, the usual way out, is to make some ideal assumption on the underlying hash function H, and we model it as what we call a random oracle. We model it as a random hash function, and then we want to prove a lower bound on any parallel adversary that tries to evaluate ROMix in this model. That means we consider an adversary that takes as input a message on which it wants to evaluate the function, and then proceeds in steps, where in each step it can submit a vector of queries, so multiple parallel queries simultaneously, to the random ideal hash function, and it can also keep some state for the next step. In the next step the adversary gets this state and the responses to these queries, and this repeats over and over again. At the end it outputs something which, if the adversary evaluates the function correctly, should be the output of the function on input m. We then consider the cumulative memory complexity of such an adversary, where we model the memory size at every step by the size of the state s_i, plus the size of the answers that the adversary gets back from its hash function queries. This is very generous: the adversary might need more memory, but we are going to prove a positive result, a lower bound, so it doesn't matter; if the adversary uses even more memory, that's only better for us.
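Before moving to the evaluation strategies, here is a minimal Python sketch of the ROMix structure just described, only to fix notation. It is an illustration of the mode, not the real thing: I'm using SHA-256 as a stand-in for the Salsa20-based, length-preserving function that scrypt actually uses, and the integer-mod-n indexing follows the informal description above rather than the exact encoding in the specification.

```python
import hashlib

def H(x: bytes) -> bytes:
    # Stand-in for scrypt's length-preserving Salsa20-based function.
    return hashlib.sha256(x).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

def romix(m: bytes, n: int) -> bytes:
    # Phase 1: iterate H to produce x_0, ..., x_{n-1}.
    x = [H(m)]
    for _ in range(1, n):
        x.append(H(x[-1]))
    s = H(x[-1])                           # initial state s_0 of phase 2
    # Phase 2: n data-dependent rounds.
    for _ in range(n):
        j = int.from_bytes(s, 'big') % n   # index derived from the current state
        s = H(xor(s, x[j]))                # XOR x_j into the state, hash again
    return s
```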
So if we think about this metric now, and we want to evaluate ROMix, there are two naive approaches you could take to evaluate the function. The first one is a naive sequential evaluation, which is what you would do when you run the function on your computer: you simply compute the values in the first phase and store them in memory, and then when you get to the second phase you just take them from memory whenever you need them. They are always there, so you can compute the entire thing in time linear in the parameter n, and the amount of memory you need is whatever it takes to store these n values, which are w bits each. If you think about the cumulative memory complexity, you have a linear number of steps and you store up to n times w bits, so the cumulative memory complexity is going to be roughly n^2 times w bits; I'm omitting constants here.

The other, alternative strategy is to say: well, I don't want to waste any memory, I want to do this almost memory-less. What you could do instead is go through the chain at the beginning without remembering anything, get to the initial state of the second phase, and then see which value you actually need. Say, for example, from s_0 I see that I need the value x_1; so I'm going to recompute it starting from x_0, get that value, XOR it into the state, then forget it, and go on; then see what the next value is, recompute it, XOR it into the state, and so on. These recomputations can be expensive, because you might have to do up to linear work to go through the chain again, on average say n/2 steps. So overall you make about n^2 steps; however, the memory you keep is very small, on the order of w bits, where w is the size of the state. So again the cumulative memory complexity is something like n^2 times w.
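As a sketch of the two naive strategies, reusing the hypothetical H, xor, and indexing convention from the previous snippet: the first keeps all n values around, the second keeps essentially nothing and recomputes each needed x_j from x_0 on demand. Both compute the same output as romix above; they just trade memory for time, and in both cases memory times time comes out around n^2 times w.

```python
def eval_store_all(m: bytes, n: int) -> bytes:
    # Strategy 1: store all of x_0..x_{n-1} (~n*w bits) for ~2n hash calls.
    x = [H(m)]
    for _ in range(1, n):
        x.append(H(x[-1]))
    s = H(x[-1])
    for _ in range(n):
        j = int.from_bytes(s, 'big') % n
        s = H(xor(s, x[j]))                # value is already in memory
    return s

def eval_recompute(m: bytes, n: int) -> bytes:
    # Strategy 2: keep only O(w) bits; recompute each needed x_j from scratch,
    # which costs ~n/2 hash calls on average, so ~n^2/2 calls overall.
    def x_at(j: int) -> bytes:
        v = H(m)
        for _ in range(j):
            v = H(v)
        return v
    s = H(x_at(n - 1))
    for _ in range(n):
        j = int.from_bytes(s, 'big') % n
        s = H(xor(s, x_at(j)))             # recomputed on demand, then forgotten
    return s
```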
So now we can start making some conjectures, which is what people have conjectured for a long time: we have two extreme strategies, one remembering everything, sequential, the other one memory-less, and we could conjecture that every other strategy is at least as bad. But of course that's just a conjecture, and a priori we really have no evidence. It could be that the real picture looks like this, that there are strategies that are much better than n^2 times w. I know the plot is a little absurd, but this is kind of a theory talk, so I thought I should put in some plot just to make it more real-world, I guess.

So, is the conjecture true or not? That's exactly what we confirm: the conjecture is true, at least up to a multiplicative factor. What we show is that, roughly, for every possible adversarial strategy that wants to evaluate ROMix, its cumulative memory complexity is going to be roughly n^2 times w, times a constant, which is 1/25 and could be up for optimization in our proof. Also note that we don't quite get n^2 times w; we get n^2 times (w minus 4 log n), where n is the parameter that gives you the number of iterations. This is somewhat inherent to our proof technique, so there is a small loss there even if you optimize the constant in front; we really can't get rid of it. But it's not a big problem if you're concerned about concrete parameters, especially because typically w is quite large, at least something like one kilobyte, and log n should be at most around 14, so you should not really be concerned too much about that 4 log n term.

Also note that despite the simplicity of this result, it is really not something easy to get; there have been numerous, well, at least a few, attempts to prove it in the past. For those of you who actually read Colin Percival's paper introducing scrypt, and I assume many of you have, there is actually a proof of memory hardness for scrypt there. It turns out, and I'm not going to go into it, we give more evidence for this in the paper, that it's pretty much folklore by now that this proof is incorrect; there is a bug there. It also targets a weaker security property, a weaker hardness measure, than cumulative memory complexity. There is also some previous work by myself and a partially overlapping set of co-authors that appeared at Eurocrypt last year, where we proved a weaker bound for scrypt; it's weaker both quantitatively, in that it falls short of being optimal, and in that it only holds for restricted adversaries. I'm not going to go into it.

Now, an important point here, which was already addressed by Jeremiah, is that this shows a separation between data-dependent memory-hard functions and data-independent ones. By a result of Jeremiah and Joël Alwen, we know that for every data-independent memory-hard function there exists a strategy that allows you to evaluate it with cumulative complexity roughly n^2 times w over log n, whereas here we prove something on the order of n^2 times w. So there is a log n gap between the two. Of course, we have to be honest here: constants matter, and a log n factor and a constant factor might end up being comparable for reasonable n. But at least asymptotically you see that the difference can be substantial, and you have to be careful when comparing it against a constant, because log n can be fairly large for reasonable parameters.
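To write the two bounds down side by side, as stated in the talk (the constant 1/25 and the 4 log n loss are the ones from our proof, and the second expression is the generic evaluation strategy for data-independent MHFs just mentioned, stated only roughly):

\[
\mathrm{cmc}(\mathrm{ROMix}_n) \;\ge\; \frac{1}{25}\, n^2 \bigl(w - 4\log n\bigr),
\qquad
\mathrm{cmc}(\text{best attack on any iMHF}) \;\approx\; \frac{n^2\, w}{\log n}.
\]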
So why is proving something like this so difficult? The main reason is that we are quite bad at dealing with bounded memory in provable security. We have to address the fact that an adversary might evaluate this function arbitrarily, and we want to show that it's hard for any possible such adversary. It tries to memorize partial information about the execution and could do whatever it likes to do that, so these intermediate states could be really arbitrary, and we have to account for that. In the iMHF case the proof was much simpler, somehow, because the very nice, elegant reduction to graph pebbling works. The problem is that this trick really requires that the data dependencies in the computation are all known a priori, something which we don't have here. In fact, the security of something like ROMix, or scrypt, or Argon2d really depends on the data dependencies being unpredictable; otherwise you cannot get security as high as we get.

The intuition, indeed, is that the reason why this thing is memory hard is that once you move to the second stage, the indices for which you have to go fetch values from the first phase are unpredictable until you compute them. So in order to be ready to proceed fast, you need to have many of these values stored in memory already; otherwise you're going to be stuck and need to recompute things.

So the intuition, and I just want to give you a four or five minute intuition about why scrypt is memory hard, is to really think about the following game when you think about the evaluation of ROMix. It's a game between a challenger and an adversary. The adversary learns the initial value x_0 and can make queries to the random hash function, the random oracle. The challenger proceeds in rounds, and in each round it gives the adversary a random challenge, which is an integer between 0 and n-1. The goal of the adversary is to come up, using as few memory bits as possible, with the value indexed by that challenge in the initial chain. So I'm abstracting away completely the second phase of ROMix, with the idea that it's really about getting these challenges that are unpredictable and going to fetch that particular value; the adversary wants to go through all of the challenges and be able to produce those values.

In fact, we can make an extra step here, just for the purpose of this talk, which is very intuitive, and model an even simpler version of this game where the adversary only remembers, similar to the pebbling case, just the values x_0 to x_{n-1}, or a subset of them. So we can really think of this as a special case, or a variant, of a pebbling game, where we simply look at the line with nodes from 0 to n-1, and putting a pebble on a particular node i indicates remembering the corresponding value x_i. Then you have the same rules as in the pebbling game we had in the previous talk, and in particular multiple moves can be done in parallel. So, in trying to evaluate, the adversary can always start by putting a pebble on the first node; it can add pebbles or move pebbles; it can only put a new pebble next to a pebble which is already there; and moves can be done in parallel, to support parallelism of the strategy.
In particular, the round game with challenges in this setting simply has the adversary learning a challenge in every round, say for example here it's 4, and the goal is now to put a pebble on node 4. For example, this will take four steps if you don't have anything on the line, because you have to start by putting down a pebble and then move step after step. Then, say, you get another challenge; now you have to put down another pebble and move there step by step. But if you then get another challenge on 6, you're lucky: you have a pebble nearby, on 4, and you can move fairly quickly to 6. What you're interested in now is the complexity of an adversary going through this game. In particular, note here that the memory complexity at a particular stage corresponds roughly to the number of pebbles times the number of bits needed to store a value.

Now we want to lower bound the complexity of any strategy that successfully goes through this game, and again, this strategy is completely arbitrary and could do anything: it can have different numbers of pebbles at different stages, whenever it gets a new challenge it could have a different number and a different subset of pebbles on the line, and it can take a different amount of time to answer each challenge. We would like to show that, no matter what the adversary does, there is a lower bound which is quadratic in n for the area under the curve defined by the red lines here. This is the core of our proof.

The basic idea here is that whenever you have a certain pebbling configuration and you learn the challenge, the time you need to answer that challenge is inversely proportional to the number of pebbles you have on the line. This is quite natural, because with a certain set of pebbles you can only be fast for so many challenges. (Oh, five minutes? Okay, good. So yeah, I think I'm going to be faster.)
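Here is a small, purely illustrative simulation of that challenge game on the line, in the same Python style as before; the particular strategy and all the names in it are hypothetical and not from the paper. A strategy that keeps p roughly equally spaced pebbles needs about n/(2p) steps for a random challenge, so the product of pebbles and answer time stays on the order of n/2 however the spacing is chosen, which is exactly the trade-off discussed next.

```python
import random

def answer_time(pebbles: set, challenge: int) -> int:
    # Steps to get a pebble onto `challenge`: walk right from the nearest
    # pebble at or below it (or start from scratch at the left end).
    below = [p for p in pebbles if p <= challenge]
    start = max(below) if below else -1    # -1: must first pebble node 0
    return challenge - start

def play(n: int, spacing: int, rounds: int, seed: int = 0) -> None:
    rng = random.Random(seed)
    pebbles = set(range(0, n, spacing))    # ~n/spacing equally spaced pebbles
    total = sum(answer_time(pebbles, rng.randrange(n)) for _ in range(rounds))
    avg = total / rounds
    print(f"p = {len(pebbles):4d} pebbles, avg time ~ {avg:6.1f}, "
          f"p * time ~ {len(pebbles) * avg:7.1f}  (n/2 = {n // 2})")

for spacing in (4, 16, 64, 256):
    play(n=1024, spacing=spacing, rounds=20000)
```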
So, if you look at the set of pebbles you have on the line, there could be anything there, say you have p of them, and you want to see how many challenges you could answer within a certain number of time steps, say two time steps (in fact we count these as three in the paper), then there are only so many such challenges: any challenge landing in this highlighted area will be answered within three steps, and anything else requires more time. If you do the math, what you see is that this gives you a very nice probabilistic trade-off: with probability at least one half, where the probability is over the choice of the challenge, the time you need to answer the challenge is at least n over twice the number of pebbles you have. In other words, with probability one half you have a trade-off of the form: the time you need to answer the challenge, times the number of pebbles you have on the line, is at least n/2.

This takes us much closer to what we want, but we are not quite there yet. In particular, assume for the simplicity of my explanation that this trade-off holds for every challenge. If we now look at the number of pebbles the adversary has on the line every time a new challenge is revealed, and we look at the rectangles defined by taking as height this number of pebbles and as width the time it takes to answer that challenge, then the trade-off tells us that the area of each of these rectangles, for all those for which the probabilistic trade-off holds, is at least linear in n. So with n challenges it's intuitive to think that the overall area should be n^2. Unfortunately, that's not quite what we need, because we need to look at the curve defined by the number of pebbles we have at every point in time, and there we have no guarantee: after the adversary learns a challenge, it might just drop everything it has and only remember, for example, the closest pebble to the challenge, forgetting everything else. In fact, for the last challenge that's the most reasonable thing to do, and anything else would be rather stupid. So what this means is that the blue area here is n^2, but we need to lower bound the area defined by the red histograms.

I'm not going to go into the details of how we do this, but the basic idea, and that's one of our two major technical contributions here, is to look not at the area under the curve after the challenge is revealed, but to realize that in order to have a fairly high number of pebbles, you need to have put them there, and so there is some work you need to have done. So what we look at is not the drop, or the behavior of the curve, after we learn the challenge; instead we use a generalized version of the trade-off to look at what happens before we learn the challenge. The other main challenge, of course, which I haven't talked about, is that here I'm looking at the simplified version in terms of pebbling, but obviously we want to deal with adversaries that can store any kind of information about the previous states of the execution. That requires some more technical work, which I'm not going to talk about here, but it's very important.
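To recap the counting step in symbols, under the simplifying assumption from above that only the values x_i themselves can be stored: with p pebbles on the line, at most p times t of the n challenges can be answered within t steps, so for a uniformly random challenge with answer time T,

\[
\Pr\!\left[\,T < \frac{n}{2p}\,\right] \;\le\; \frac{p \cdot \frac{n}{2p}}{n} \;=\; \frac{1}{2},
\qquad\text{hence}\qquad
\Pr\!\left[\,T \cdot p \;\ge\; \frac{n}{2}\,\right] \;\ge\; \frac{1}{2}.
\]

Each challenge therefore contributes a rectangle of area roughly n/2, and summing over n challenges gives the intuition for a total area on the order of n^2; turning this into a bound on the actual memory curve is where the "look at what happens before the challenge" argument comes in.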
Now, I have some lengthier conclusions here that I want to go through. The first one is that I think this is an important result because it's a good example of an interesting theory problem that nevertheless validates a very practical design. I have a quote here from Phil Rogaway's essay on the moral character of cryptographic work, where in fact he motivates exactly this kind of effort, trying to understand these modes, as an important line of work. But that's not the only point: it also gives us a very good example of a practical design with strong provable memory-hardness guarantees, the first one which is really practical and has such provable guarantees, and it's really the first example, even in theory, of an optimally memory-hard MHF.

Now, a question you might ask, because I've been focusing on scrypt and ROMix, is: what about Argon2d? Argon2d, for those of you who are familiar with it, has a very similar design to scrypt; the main difference is that you don't have two such phases of execution, but rather a sliding window of values that you point back to and insert into the execution. We don't have this written down in the paper, that's why I have a star there, but the same techniques should easily give the same lower bound; we just don't have it written up yet.

Another thing I wanted to say, which also came up in the previous talk, is about the issue of dMHFs versus iMHFs, data-dependent versus data-independent memory-hard functions. At least with respect to the goal of achieving high, optimal memory hardness in a provable sense, dMHFs seem to be easier to get right, and they also provably achieve higher memory hardness, at least in an asymptotic sense, despite being harder to prove memory hard. Other than that, of course, the main concern is side channels, something that was also mentioned by Jeremiah: they are a possible concern because your memory access patterns do depend on the input. So the main point here is that you should really assess whether, in the application you're looking at, you're more concerned about memory hardness being strong or about possible side channels. There are applications like proofs of work where side channels do not seem relevant at all; for password hashing you have to see. But if you're not concerned about that, then there's really no reason why you shouldn't be using a data-dependent memory-hard function, since the guarantees really seem to be stronger.

Finally, just a few words about proofs. There has been some debate, also for those of you who follow CFRG, on the role of proofs in the context of memory hardness. I really think they are important: if you have a mode that has a proof and is as efficient as one without, I don't think there's any reason to discard the one with the proof. And of course, the proof is not the only thing.
There are a lot of interesting questions related to how tight these proofs are and whether the models are the best ones, but I still think the proof remains an important component. And of course, when I say proofs, I mean you have to target concrete security and be as concrete as possible, especially in this domain, where speeding things up by a constant factor really matters and makes an impact; you have to really work and make an effort to get bounds that are as tight as possible. Okay, so that's actually everything I wanted to say. There is a paper on eprint about these results, so feel free to have a look.

A quick question from the audience: there was a constant in front of your bound, right? Does that give any inspiration for a new attack, one that would give a small constant-factor speed-up?

I don't think the bound is really tight. I mean, there are many ways of relaxing things in the proof that might make the constant a little better; for example, just requiring that the usage of memory is slightly higher and trying to prove that. So it's not really clear, and there is some work there that should be done to try to tighten things up. I don't think 1/25 is the final answer.