All right, thanks. So I'm excited to tell you about data independent memory hard functions. I should mention this is joint work with my collaborators at IST Austria, Joël Alwen and Krzysztof Pietrzak. You might think it's a little odd to see the word "theory" in the title at a real world crypto conference, but I promise you this is about a very real world problem.

So consider the problem of password storage. In particular, suppose that an attacker manages to break into the server and steal the stored hash values. Now he can mount an offline attack, in which he compares each stolen hash against the hashes of the passwords in a large dictionary. These attacks are increasingly commonplace. This was my wall of shame as of a couple of years ago, and it's a wall that just keeps growing and growing; by now I can claim that billions of user accounts have been affected by these attacks. Not only is this a common problem, it's an increasingly dangerous one. If you log into Amazon right now and search for the Antminer S9, you can purchase this machine for about $3,000, and it computes 14 trillion SHA-256 hashes per second. That's a lot of password hashes per second. My guess is that most user passwords are not going to stand up to that level of attack. In particular, this is what the user password distribution looks like at the moment: people continue to select weak passwords, because despite decades of warnings, it's difficult to create and remember high entropy passwords. So we seem to be stuck in a world where users continue to select low entropy passwords.

This motivates the goal of developing moderately expensive hash functions. We have two somewhat contradictory requirements here: we want a function that can be computed quickly on your own personal computer, and we also want a function that's expensive for the adversary to compute, even on customized hardware he might purchase, like that Antminer S9. One of the promising techniques for achieving both goals is memory hard functions. In particular, memory costs tend to be equitable across different architectures: the cost of building a gigabyte of RAM into an ASIC is not dramatically lower than the cost of just purchasing a gigabyte of RAM for your personal computer.

All right, so let me now introduce data independent memory hard functions. First, let's talk about memory hard functions. What's the intuition? A memory hard function is a function whose computation costs are dominated by memory costs. To compute a password hash even once, we want to force the attacker to lock up a large amount of memory for a long period of time, as opposed to just iterating a hash and occupying a very small processing unit for a short window of time. Scrypt is one classical example of a memory hard function, and the next talk will actually give some exciting new positive results for scrypt. One of the downsides of scrypt, though, is that it induces a data dependent memory access pattern: the pattern in which memory is accessed depends on the sensitive user input, in this case the user's password. Why is this a problem? It means the password is potentially vulnerable to side channel attacks, for example cache timing attacks. So a data independent memory hard function is simply a memory hard function whose memory access pattern does not depend on the secret user input.
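To make the offline attack from the opening concrete, here's a minimal sketch; the salt, digest, and dictionary are hypothetical placeholders, and plain salted SHA-256 stands in for whatever fast hash the server might have used.

```python
import hashlib

def offline_dictionary_attack(stolen_digest: bytes, salt: bytes, dictionary):
    """Compare a stolen salted hash against the hashes of every password
    in a dictionary; with a fast hash like SHA-256, hardware such as the
    Antminer S9 can test trillions of candidates per second."""
    for candidate in dictionary:
        if hashlib.sha256(salt + candidate.encode()).digest() == stolen_digest:
            return candidate  # password recovered
    return None

# Hypothetical stolen record: the server hashed "hunter2" with a public salt.
salt = b"public-salt"
stolen = hashlib.sha256(salt + b"hunter2").digest()
print(offline_dictionary_attack(stolen, salt, ["123456", "letmein", "hunter2"]))
```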
So if we adopt a data independent memory hard function, then we don't have to worry about these side channel attacks.

All right, so formally, what is a data independent memory hard function? It's defined by two things: a directed acyclic graph G, which specifies the data dependencies during the computation, and a compression function h, which we'll treat as a random oracle in our analysis. So given a graph like this, how do we compute the function? The input is the password and the salt. The label of the first node is just the hash of the password and the salt, and the label of an internal node is the hash of the labels of its parents; in this case, the label of node three is the hash of labels one and two. Finally, the output of the function is the label of the last node, the sink.

All right, so to talk about computing a data independent memory hard function, we can use the language of graph pebbling. Placing a pebble on a node means that we compute the corresponding data value and store it in memory. Removing a pebble from a node means that we free that value from memory. And of course, our goal is to place a pebble on the last node, that is, to compute the final value. There are rules that govern the pebbling: we can't just place a pebble on the graph at any point in time. We can only compute a value if we have all of the values it depends on in memory, so we can only place a new pebble on a node at time step i if, at the previous time step, we had pebbles on all of its parents. And the final requirement is that we eventually finish pebbling the graph, since our goal is to compute the output.

So here's a simple example of a pebbling. We start off with no pebbles on the graph. We can begin by placing a pebble on node one, then a pebble on node two. Now we can place a pebble on node three, and at the same time we might free those first two values to save memory. Next we place a pebble on node four, and now that we have pebbles on nodes three and four, we can place a pebble on node five. Pretty straightforward.
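Here's a minimal sketch of this labeling computation, with SHA-256 standing in for the random oracle h; the five-node graph below matches the pebbling example, with the edge set as I've inferred it from the walkthrough.

```python
import hashlib

def evaluate_imhf(parents, password: bytes, salt: bytes) -> bytes:
    """Evaluate the IMHF defined by a DAG and a compression function.

    `parents` maps each node v to the list of its parents; node 1 is the
    source and the highest-numbered node is the sink. The label of node 1
    is h(password, salt), the label of every other node is the hash of
    its parents' labels, and the output is the label of the sink.
    """
    n = max(parents)
    label = {1: hashlib.sha256(password + salt).digest()}
    for v in range(2, n + 1):
        h = hashlib.sha256()
        for u in parents[v]:
            h.update(label[u])
        label[v] = h.digest()
    return label[n]  # output = label of the sink

# The five-node graph from the pebbling example (edges as inferred):
# 1 -> 2, {1, 2} -> 3, 3 -> 4, {3, 4} -> 5.
G = {1: [], 2: [1], 3: [1, 2], 4: [3], 5: [3, 4]}
print(evaluate_imhf(G, b"hunter2", b"public-salt").hex())
```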
All right, so recall that our goal was to force the attacker to lock up a large amount of space for a long time. How do we formalize this requirement? A first attempt is space-time complexity: the space-time complexity of a pebbling strategy is the number of pebbling steps multiplied by the maximum number of pebbles on the graph at any point in time, which is the space you use. This is a nice notion with a rich theory, but I claim it's not an appropriate metric for password hashing. Why? The problem is that under parallel computation, ST complexity can scale badly in the number of evaluations of the function. Suppose, for example, we have a function that needs a lot of space initially, but only for a very short period of time, so the space usage looks like this blue curve. The ST complexity of this evaluation strategy is quite high, but if we wanted to evaluate multiple instances of the function, we could simply pipeline and evaluate multiple instances in parallel without increasing the space-time complexity. In particular, there exist functions where you can evaluate up to square root of n instances of the memory hard function without increasing the space-time complexity at all.

So this motivates cumulative complexity, as defined by Alwen and Serbinenko. Cumulative complexity is just the area under this curve: the sum, over all pebbling steps, of the number of pebbles on the graph at that point in time. What's nice about this metric? A couple of things. The first is amortization: the cumulative cost of pebbling two independent instances of the graph is exactly twice the cost of pebbling one instance, which means the attacker's costs scale with the number of password guesses he wants to try. So how does it work? Remember our previous pebbling; its cost is just 1 plus 2 plus 1 plus 2 plus 1, or 7 in total. Another reason this notion is nice is an equivalence established, again, by Alwen and Serbinenko: informally, and at a very high level, high pebbling complexity of G implies that the corresponding memory hard function has high amortized memory complexity. So it suffices to reason about the structure of the underlying graph to prove security of the underlying IMHF.

All right, so that's cumulative complexity. Since this is real world crypto, we'll go one step further and introduce some constants that theoreticians generally don't like to think about. In particular, we not only have to allocate space to evaluate the function, we also need some cores on the chip to evaluate the hash function. The constant R here is just the space equivalent of implementing a random oracle core on chip. At a theoretical level this doesn't change the asymptotics, but it brings us closer to reality when we evaluate these functions.

All right, so now that I've told you how to evaluate the security of an IMHF, let's think about pebbling algorithms, and first about the pebbling algorithm used by the honest party, which we'll call the naive pebbling algorithm. Because the naive algorithm is run by an honest party, we typically expect it to be something you could run on a sequential computer, so the constraint is that only one new pebble can be placed on the graph per round. The attacker doesn't operate under this constraint, but the honest party does. An example of a naive pebbling algorithm is to pebble the graph in topological order, node 1, node 2, node 3, and so on, and never discard pebbles. How long does this take? It takes n steps, with an average of n over 2 pebbles on the graph at each point in time, so the expected cost, the cumulative energy cost, scales with n squared.

So what does it mean to have an attack on a data independent memory hard function? We call an algorithm an attack if its amortized cost of computing the function is lower than the cost of the naive algorithm. As an example, suppose algorithm A evaluates five IMHF instances with total cost 100, for an amortized cost of 20 per instance, and suppose the naive algorithm costs 40. Then the quality of our attack is 40 over 20, which is 2.
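A small sketch of these cost notions: cumulative cost as the area under the space curve, the naive sequential pebbling that never discards, and the attack-quality ratio. Representing a pebbling as a list of pebble sets, one per round, is my own convention here.

```python
def cumulative_cost(pebbling):
    """Cumulative complexity of a pebbling: the sum, over all rounds, of
    the number of pebbles on the graph (the area under the space curve)."""
    return sum(len(pebbles) for pebbles in pebbling)

def naive_pebbling(n):
    """Pebble nodes 1..n in topological order and never discard: round i
    holds i pebbles, so the cost is 1 + 2 + ... + n, about n^2 / 2."""
    return [set(range(1, i + 1)) for i in range(1, n + 1)]

def attack_quality(naive_cost, attack_total_cost, num_instances):
    """Quality: the naive cost per instance divided by the attacker's
    amortized cost per instance."""
    return naive_cost / (attack_total_cost / num_instances)

# The hand pebbling from the earlier example: cost 1 + 2 + 1 + 2 + 1 = 7.
example = [{1}, {1, 2}, {3}, {3, 4}, {5}]
print(cumulative_cost(example))            # 7
print(cumulative_cost(naive_pebbling(5)))  # 15, i.e. 5 * 6 / 2
print(attack_quality(40, 100, 5))          # 2.0, matching the example above
```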
All right, so what properties do we desire of an IMHF? Well, for practical reasons, we want a graph with constant indegree. We also want to ensure that any attack A has small quality, at most c for some hopefully small constant c. And we want to ensure that the naive algorithm is reasonably expensive. Why this third constraint? It tells us that memory costs should dominate. Remember also that users are impatient, so n, the running time of the naive algorithm, is fixed; we want to make the function as expensive as possible given that bounded running time. That is, we want to maximize cost for fixed running time n. And we'll say that an MHF is c-ideal if it satisfies all three of these constraints, the last one for some constant tau.

All right, so those are the desirable properties for an IMHF. Now let me tell you about an attack on existing IMHF candidates. The main takeaway from this talk is that there's a combinatorial property called depth robustness which completely characterizes secure data independent memory hard functions: depth robustness is necessary, and also sufficient, for building a secure IMHF.

So what is depth robustness? A graph G is (e, d)-reducible if there exists a subset S of vertices of size at most e such that removing these nodes from the graph reduces the depth of the graph to at most d. By the depth of the graph, I mean the length of the longest path remaining after removing the nodes in S. And if a graph is not (e, d)-reducible, we say it's (e, d)-depth robust. A simple example: here's a (1, 2)-reducible graph, meaning we can delete one node and reduce the depth to two. Pretty easy to spot here: just delete node three, and it's easy to verify visually that any remaining path has length at most two.

All right, so now I claim that we can attack any (e, d)-reducible graph. How do we do that? The only thing we know about the graph is that it's (e, d)-reducible, so as input, our attack just takes the subset S of nodes that reduces the depth of the graph. The attack works in two kinds of phases: light phases and balloon phases. The goal of a light phase is to make a lot of progress, pebbling the next g nodes that we want to pebble. During a light phase, the intuition is that we're not going to keep many pebbles on the graph: we discard almost every pebble, except for pebbles on nodes in S and pebbles on the parents of the nodes we want to pebble in the next g steps. So we use low memory, and this phase lasts a long time, since g will typically be large. Of course, at some point we run out, and we no longer have pebbles on the parents of the next nodes that we want to pebble. So now we execute a balloon phase to recover all of these missing pebbles. The key point is that because the depth of the graph after removing S is small, a balloon phase can very quickly recover all the missing values that we previously discarded. A balloon phase is expensive, since we operate in parallel and place a lot of pebbles on the graph, but it's over very quickly: you execute for d steps, recover everything you've discarded, and then continue on your way.

All right, so our theorem is that if your graph is (e, d)-reducible, then there's an efficient attack A with the following complexity.
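Written out, the bound has roughly the following shape; this is a reconstruction from the term-by-term walkthrough that follows, with the random-oracle query terms left schematic since they aren't spelled out here.

```latex
\mathrm{cost}(\mathcal{A}) \;\lesssim\;
  \underbrace{e \cdot n}_{\text{keep } S \text{ pebbled}}
  \;+\; \underbrace{\delta \cdot g \cdot n}_{\text{parents of next } g \text{ nodes}}
  \;+\; \underbrace{\frac{n}{g} \cdot d \cdot n}_{n/g \text{ balloon phases, } d \text{ rounds, } \le n \text{ pebbles}}
  \;+\; \bigl(\text{random-oracle query terms}\bigr)
```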
This is a complicated expression, so let's walk through it term by term. We keep pebbles on the set S, which has size at most e, so we pay cost e times n to keep e pebbles on the graph for n rounds. During the light phases, we also maintain pebbles on the parents of the next g nodes that we want to pebble; the indegree is delta, so that's at most delta times g parents, and we keep these pebbles on the graph for n rounds again, giving delta times g times n. The next term is the cost of the balloon phases: we execute n over g balloon phases in total, each balloon phase lasts d rounds, and the maximum number of pebbles on the graph during a balloon phase is at most n, the number of nodes in the graph, so this upper bounds the cumulative space cost. And the last two terms, just trust me, are the cost of querying the random oracle.

All right, so we have this complicated looking bound. If we tune the parameters appropriately, we get the following energy complexity, and note in particular that if e and d are both much smaller than n, this gives us an attack: an algorithm that evaluates the function with cost little-o of n squared. This is bad, because we want to ensure that any attack requires cost on the order of n squared.

OK, so now we have a generic attack on any depth-reducible graph. The question, then, is whether existing IMHF candidates are based on depth robust graphs. In this talk, I'll consider a few candidates. There's Catena, an entrant in the password hashing competition which received special recognition. There's Argon2, the winner of the password hashing competition; in particular Argon2i, the data independent mode, is the recommended mode for password hashing. And there's a newer proposal called balloon hashing; the original paper had three variants, though I believe there's just one variant in the current proposal. In summary, the answer is no: none of these graphs are depth robust. Catena is actually close to maximally depth-reducible: if I remove e nodes, I can reduce the depth to n over e, which is about as bad as it gets. The consequence is that the cumulative cost of computing this IMHF scales as n to the 1.62; that 1.62 is in the exponent. Balloon hashing and Argon2i are also depth-reducible, slightly better than Catena but still depth-reducible, and we get an attack whose cost scales as n to the 1.71. The latest version of Argon2i actually seems to be a little better, but it's still (e, d)-reducible, and the cumulative complexity scales as n to the 1.77. In any case, none of these are close to n squared, which is what we ideally want. And the same general techniques apply to a host of other IMHF candidates; there's follow-up work looking at POMELO and other variants from the password hashing competition.

But let me focus on Argon2i, since it's the winner of the password hashing competition. What does Argon2i look like in terms of its graph structure? You start off with a chain, 1 through n, and then for each node i you pick a random predecessor r(i). In the original version, this random predecessor is chosen from the uniform distribution; in the newest version, it's chosen from a slightly more sophisticated distribution, but this turns out not to matter much for the performance of our attack.
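Here's a minimal sketch of sampling an Argon2i-style graph as just described, using the original uniform predecessor distribution; as noted, the real Argon2i distribution is more sophisticated, so this is an approximation for illustration.

```python
import random

def argon2i_graph(n, seed=0):
    """Sample an Argon2i-style DAG on nodes 1..n: each node i > 1 has a
    chain edge from i - 1, plus one random predecessor r(i) chosen
    uniformly from 1..i-1 (the original Argon2i distribution)."""
    rng = random.Random(seed)
    r = {i: rng.randrange(1, i) for i in range(2, n + 1)}
    parents = {1: []}
    for i in range(2, n + 1):
        parents[i] = [i - 1, r[i]]  # duplicate parents are harmless
    return parents, r

parents, r = argon2i_graph(8)
print(parents)
```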
So here's how you would, for example, reduce the depth of an Argon2i graph to about the square root of n. First, we arbitrarily partition the nodes into layers, each layer containing n to the 3/4 nodes. Now we delete the set S2, which is essentially all nodes whose random predecessor lies in the same layer. The claim, which I won't prove here, is that S2 is pretty small. What happens after we remove S2 from the graph? Each layer essentially becomes a path, and it turns out to be easy to reduce the depth of a path. So now we reduce the depth of each layer, since each one is a path, to, say, the fourth root of n. Once we've done that, any path through the whole graph can stay in a single layer for at most the fourth root of n steps, and there are the fourth root of n layers, so the total depth of the graph is about the square root of n.

All right, so that gives us attacks in theory. Of course, this is a real world crypto conference, so the question is: are these just attacks in theory, or could they lead to practical attacks at real memory parameters that we might adopt? We would argue the answer is yes. In particular, for Argon2i with memory parameter 2 to the 20, which is a gigabyte of memory, a practical setting, we get attack quality exceeding 5, and it increases rapidly from there: at just 2 gigabytes of memory, the attack quality is almost 10, and it keeps increasing with memory. To get this plot, we actually simulated our attack: instead of just plugging in the theoretical bounds, we implemented the attacks, generated random Argon2i graphs, and ran the attacks to see what the performance looks like, and this is what you get. I should mention that I've had a graduate student working this semester on some alternative heuristics, and I think these curves are going to shift a bit further left. I don't have those results yet, but if I had to make a wager, the attacks are going to become even more practical at even smaller memory parameters. I should also note that even with pessimistic parameter settings, say six passes through a gigabyte of memory, we still get attacks: you can still reduce your cost by a factor of approximately 2. And of course, there are good reasons why you might not want to make six passes over memory; in particular, users are impatient, so you probably don't have time to make six passes over a gigabyte of memory.

This doesn't just apply to Argon2i. In fact, we have a general theorem stating that ideal IMHFs don't exist. Simplifying a little, we prove that any graph G with constant indegree is at least somewhat depth-reducible. In particular, that implies there's always an attack with quality roughly log n over log log n; sorry, the slide has that ratio inverted, and I'll fix it before I send the slides out. So this is true in theory, but we actually can't rule out ideal IMHFs in practice: if you look at the memory parameters at which these generic attacks become practical, it's somewhere around 2 to the 51, orders of magnitude above any real memory parameter we would select.
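Before the new results, here's a hedged sketch of the layering argument above in action: it removes nodes whose random predecessor lands in their own layer (the set S2), cuts the remaining chain every n^(1/4) nodes, and then measures the remaining depth. The sampler is repeated so the snippet stands alone; sizes and the uniform distribution are illustrative assumptions.

```python
import random

def argon2i_graph(n, seed=0):
    """Chain 1..n plus one uniformly random predecessor r(i) per node."""
    rng = random.Random(seed)
    r = {i: rng.randrange(1, i) for i in range(2, n + 1)}
    parents = {1: []}
    for i in range(2, n + 1):
        parents[i] = [i - 1, r[i]]
    return parents, r

def depth_after_removal(parents, removed):
    """Length, in edges, of the longest directed path avoiding `removed`.
    Nodes are numbered in topological order, so one forward pass suffices."""
    longest, best = {}, 0
    for v in range(1, max(parents) + 1):
        if v in removed:
            continue
        longest[v] = max((longest[u] + 1 for u in parents[v]
                          if u not in removed), default=0)
        best = max(best, longest[v])
    return best

n = 2 ** 16
layer_size = round(n ** 0.75)        # n^{3/4} nodes per layer, n^{1/4} layers
gap = round(n ** 0.25)               # cut the chain every n^{1/4} nodes
layer = lambda v: (v - 1) // layer_size
parents, r = argon2i_graph(n)
# S2: nodes whose random predecessor lies in their own layer; removing them
# leaves only chain edges inside each layer, so each layer becomes a path.
S = {v for v in range(2, n + 1) if layer(r[v]) == layer(v)}
# Chop each layer's chain so no layer contains a path longer than `gap`.
S |= set(range(gap, n + 1, gap))
# A path now spends at most ~gap steps per layer across ~n^{1/4} layers, so
# the depth drops to roughly sqrt(n) = 256; |S| is a few times n^{3/4}.
print(len(S), depth_after_removal(parents, S))
```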
All right, so in the last three or four minutes, let me tell you about some exciting new results. Not only is depth robustness necessary for a secure IMHF, it's also sufficient. In particular, if G is (e, d)-depth robust, then the cumulative cost of pebbling G is at least e times d. That's a pretty simple theorem statement, and since this is a real world crypto conference, I don't have to be embarrassed and hide behind the simplicity of the proof: that's actually the entire proof on the slide. I won't walk through it, but it fits in a paragraph.

So what are the implications of this theorem? One implication is that there exists a constant indegree graph with cumulative complexity scaling as n squared over log n. This beats the previous best construction, due to Alwen and Serbinenko, which achieves n squared over log to the 10 of n. And in fact, by the generic attack I showed you earlier, we can't really do better in an asymptotic sense. I should mention that the construction from this paper, ABP16, is definitely not practical yet, but we are working on some constructions that we believe will be practical, so hopefully I'll have updates to share soon. Some other new results: with this technique, we can not only upper bound the complexity of computing these IMHFs, we can also prove lower bounds. In particular, for the latest version of Argon2i there's a lower bound: the cumulative complexity is at least n to the 1.66, and similarly for balloon hashing. And scrypt, which we'll hear about in the next talk, actually has a lower bound of n squared, which is exciting; but of course scrypt is data dependent, which is why our upper bounds don't apply to it.

All right, so in conclusion, depth robustness is necessary and sufficient for building secure IMHFs. I think the big challenge, and one I hope people will be motivated to work on, is improved constructions of depth robust graphs. The result in our paper relies on a result of Erdős, Graham, and Szemerédi from the seventies. They were combinatorialists and weren't particularly concerned with practical efficiency, but in this setting the constants obviously matter a great deal. More open questions: I think it would be cool to automate the cryptanalysis of IMHFs. Unfortunately, we have some results suggesting this may not be possible in general; it's NP-hard to compute the cumulative complexity of G, though we can't rule out heuristic approximation algorithms. And of course, there's still room for a tighter analysis of the latest version of Argon2i, since there's a gap between the lower bound and the upper bound. So with that, thanks for listening.

Thanks. So, several times in your talk you emphasized that the indegree delta has to be constant, but surely it only has to be order one on average, for performance reasons, right? You only care how many hashes you're doing on average. Like you said, the naive thing the honest user would do is keep all the pebbles in memory and never erase a pebble. So suppose you had a graph that was just one long chain, and then the last node hashes everything you've ever seen together. That gives you a constant average indegree, and it seems to me that the naive algorithm you suggest is the best you can do there. So what's wrong with that?

Okay, so that's a good question. First of all, if we're talking about average degree, you can still reduce the depth of any graph with constant average degree, so the same result would apply. Now, for that example...

Yeah, but you need all the pebbles right at the end. You need n pebbles in memory.

Right, so there would be a recursive way to evaluate it. Take just the first chain, the n minus one nodes.
There would be a way to pebble all of those nodes with cost, I believe, about n times log n, using a recursive approach. It's hard to prove this on the fly, but I believe the cumulative cost of evaluating that function would actually be about n log n. Of course, this assumes you can execute that last step in essentially constant time; if you have a hash value that depends on everything computed previously, it doesn't seem accurate to treat that as an atomic step. But it's a good question, yeah.

A small comment from the Argon2 designers, in particular for the real world implementers here. Could you scroll back to the plot with the attack qualities? That one. So, interestingly, on this plot you see Argon2 with different parameters, and the curves at the bottom, the ones with the lowest attack quality, are also slower and fill less memory within the same time compared to the upper ones. So when you increase this parameter tau, the attack quality decreases, but it decreases more slowly than the adversary's brute force cost does. So, counterintuitively, to maximize the cost for an adversary brute forcing your passwords, it's better to use the Argon2 settings with the highest attack quality, not the lowest.

Yeah, that's an excellent point. There are two criteria here. There's attack quality, which measures the ratio between the cost of the honest party and the cost of the attacker, and because we're plotting attack quality here, the plot hides what the attacker's true cost actually is. If you want to maximize the attacker's cost, you're actually going to pick this red line. That means, of course, that the attacker is going to want to run our algorithm, because it gives him the highest advantage, but it also means that if you're the honest party deploying this function, your optimal choice is to select the red option here. Yeah.

Great. Why don't we thank Jeremy again.