Time-memory trade-offs for near-collisions, by Gaëtan, and he is going to present them. Thank you.

So I'm going to talk about time-memory trade-offs for near-collisions, and by near-collisions I mean generic algorithms for finding near-collisions. First, let's talk a little bit about hash functions. The ideal model for a hash function is a public random oracle. What this means is that it should take any kind of document as input and give a random string as output. For every new document you should get a new random output, but of course, if you give it the same input twice, you should get the same output twice. That's what you try to achieve with a hash function. Of course, it's not really possible to have this kind of oracle; what we get is just a concrete function, so we have some more concrete security goals, which are preimage resistance, second-preimage resistance, and collision resistance. Those are the three main security goals that we look at when we build a hash function or when we analyze a hash function. But those three goals are not really enough, because hash functions are used in many, many different settings under many different assumptions. So we also have to look at some more security objectives, like multi-collision resistance, resistance to herding attacks, and so on. In particular, one notion that we sometimes look at is the notion of near-collision.

So what is a near-collision? It means finding two different messages such that the Hamming distance between the hashes of those messages is small, meaning smaller than some fixed bound w. It's interesting because it's a kind of generalization of a collision attack: a collision attack would be a near-collision with distance 0, and with near-collisions we allow some more freedom. If we look at cryptanalysis, near-collision attacks often use techniques similar to real collision attacks, so it's a good way to measure the security margin: we allow a bit more freedom to the adversary by letting him do near-collision attacks instead of full collision attacks. Sometimes, when you have a near-collision attack, you can turn it into a real collision attack. It doesn't always happen, but for instance on SHA-1 we don't know how to make a full collision attack in one block, yet we can use several blocks to turn near-collisions into a full collision attack. And in fact, if you look at the literature, there are many attack papers that consider this notion of near-collision, because it allows them to go through a few more rounds.

Given all those papers doing near-collision attacks, the natural question is: what is the generic complexity of a near-collision attack? If someone claims they can find a near-collision for SHA-512 with 100 active bits at a cost of 2^100, should it be considered an attack or not? Well, that depends on the generic complexity of this kind of attack, and that's what we're going to discuss in this talk: what is the generic complexity of finding near-collisions, without looking inside the compression function?
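To make the notion concrete before we look at generic algorithms, here is a minimal sketch of the near-collision predicate, assuming SHA-256 truncated to 128 bits as a stand-in hash; the parameters and helper names are purely illustrative.

```python
# A minimal sketch of the w-near-collision notion; SHA-256 truncated to
# N_BITS is only a stand-in for "some n-bit hash function".
import hashlib

N_BITS = 128   # output size n (illustrative)
W = 10         # distance bound w (illustrative)

def h(message: bytes) -> int:
    """Hash truncated to N_BITS, viewed as an integer."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - N_BITS)

def is_near_collision(m1: bytes, m2: bytes, w: int = W) -> bool:
    """m1, m2 form a w-near-collision if HW(h(m1) XOR h(m2)) <= w."""
    return m1 != m2 and bin(h(m1) ^ h(m2)).count("1") <= w

# A full collision is the special case of distance 0.
print(is_near_collision(b"message A", b"message B"))
```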
If we look at the state of the art, there are of course several known results. We know there is a lower bound on the complexity of a near-collision attack, and we know there is a simple algorithm to reach it, but this algorithm needs a lot of memory. On the other hand, we have memoryless algorithms based on truncation or based on covering codes, but those algorithms are of course more expensive than the lower bound. So there is a kind of gap in between, and that's going to be the topic of this talk: trying to bridge this gap.

First, let's talk about this lower bound. It's a very simple analysis. If we have a hash function and we compute it i times on some inputs, this gives us about i^2 pairs of outputs. Each pair is a near-collision with a certain probability, and from this we can derive the lower bound, which is 2^(n/2) / sqrt(B_w(n)). We can see that near-collision attacks are easier than collision attacks, as expected, and the gain is a factor sqrt(B_w(n)), where B_w(n) is the size of a Hamming ball of radius w, meaning the number of words that are close enough to the zero word.

So we have a lower bound, and it is relatively easy to reach it. You just compute the hash function the same number of times as in the bound, then you look at each pair of outputs, compute the Hamming distance, and hopefully one of them will be small enough to be a near-collision. The good thing about this simple algorithm is that you only compute the hash function i times, so you reach the lower bound in terms of hash evaluations. The bad thing is the loop where you have to go over each pair of outputs: you have i^2 comparisons and i^2 memory accesses, and in practice this i^2 term will be bigger than 2^(n/2). So if you actually implement this algorithm, it will be less efficient than a full collision attack, and it doesn't make a lot of sense in practice. That's the main problem with it: you need a lot of comparisons, a lot of memory accesses, and also a relatively large memory.

To overcome this, people have tried to find near-collisions without memory. The reason for this is that for collision attacks we know very efficient memoryless algorithms, which are just as good as algorithms using a lot of memory. So the idea is: can we do the same for near-collisions? Could we build a memoryless algorithm that would be as good as one that uses a lot of memory?

First, how does it work for collisions? The very well-known algorithm is the rho method, and it looks something like this. You iterate the hash function: you start from some random point, you hash it, then you hash the result again, and again, and again, building a sequence of points. When the sequence gets long enough, there will be a collision in it: at some point you reach a value you have already seen, and from then on you cycle in a loop, so the trajectory looks like the Greek letter rho. You can then use a variety of cycle-detection algorithms to find the collision without storing the whole sequence.
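Here is a rough sketch of the naive memory-full algorithm discussed above, with the number of hash calls set to the lower-bound estimate; the stand-in hash (SHA-256 truncated to n bits), the way inputs are generated, and the retry strategy are illustrative assumptions of mine, not the exact procedure from the talk.

```python
# Naive memory-full near-collision search: i hash calls, ~i^2 comparisons.
import hashlib
from math import comb, isqrt
from itertools import combinations

def h(message: bytes, n: int) -> int:
    """SHA-256 truncated to n bits, a stand-in for the real hash."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") >> (256 - n)

def hamming_ball(n: int, w: int) -> int:
    """B_w(n): number of n-bit words with Hamming weight at most w."""
    return sum(comb(n, k) for k in range(w + 1))

def naive_near_collision(n: int, w: int):
    # Number of hash calls i ~ 2^(n/2) / sqrt(B_w(n)), the lower-bound estimate.
    i = (1 << (n // 2)) // isqrt(hamming_ball(n, w)) + 1
    outputs = {k: h(str(k).encode(), n) for k in range(i)}       # i hash calls
    for (k1, d1), (k2, d2) in combinations(outputs.items(), 2):  # ~i^2 pairs
        if bin(d1 ^ d2).count("1") <= w:
            return str(k1).encode(), str(k2).encode()
    return None  # unlucky run: retry with fresh inputs

# Toy parameters so it finishes quickly; the talk's example is n=128, w=10.
print(naive_near_collision(n=32, w=4))
```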
If you want to use this same cycle-finding idea for near-collision attacks, it's not going to work, because the main feature we use in those memoryless collision algorithms is that if you start from two random points and iterate chains of computation from them, then once the chains collide, they stay equal if you keep iterating. So you can detect the collision much later than when it actually happens, and that is a crucial feature for memoryless collision search. With near-collisions, if you start from two random points and iterate, at some point you may have a near-collision, but if you keep iterating, the two points will not stay close to each other: they are just different values, and they drift apart. So we need other kinds of techniques to find near-collisions without this memory requirement.

There have been two main ideas proposed to do this. The first one is truncation, and it's very simple. You take your hash function with n-bit output and you just truncate away some part of the output. Now you look at this truncated hash function and you look for collisions in it. Finding a collision is easy: we can do it without memory. When you have a collision in the truncated function, you look back at the full output, and the extra bits that were truncated will be essentially random values. For instance, if you truncate exactly w bits, then there are only w bits that you don't control, so there will be at most w active bits, and this gives you a near-collision with distance at most w. This is good because the complexity of this very simple algorithm is now 2^((n-w)/2), so it is more efficient than a basic collision attack.

That's if you truncate exactly w bits, but you can also truncate more. If you truncate, for instance, 2w+1 bits, then fewer bits remain, so the collision attack on the remaining bits is more efficient. And if you look at the 2w+1 truncated bits, with probability one half at most w of them will be active. So you just have to repeat the collision search twice on average, and one time out of two you will get a near-collision that is close enough. The complexity is now better because you have (n-2w-1)/2 in the exponent and only a small factor of about 2 at the end. Of course this is not necessarily optimal either; you can do it in a more general way: truncate tau bits, find collisions on the n-tau remaining bits, and look at the probability that those tau bits have at most w active bits in the difference. There is a very nice paper by Lamberger and Teufl that studies exactly this situation, and they show that the optimal value of tau is roughly (2 + sqrt(2))·w; when you do this, you get the complexity shown here. So that's the best we know how to do with truncation in a memoryless setting.

Now, if we look at it another way, what we are basically doing is building a function f such that if f(x) = f(y), then x and y are close enough. That's what we do when we truncate.
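Here is a minimal sketch of this memoryless truncation approach, using Floyd's cycle finding as the collision-search engine; the stand-in hash (SHA-256 truncated to n bits), the encoding of chain values into messages, and the restart strategy are illustrative assumptions, not the exact construction from the talk.

```python
# Memoryless w-near-collision search via truncation + Floyd cycle finding.
import hashlib

def full_hash(msg: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") >> (256 - n)

def truncated_near_collision(n: int, w: int, tau: int):
    """Truncate tau bits, find collisions on the remaining n - tau bits,
    and keep the first pair whose full outputs are within distance w."""
    flavor = 0
    while True:
        # Step function on the truncated (n - tau)-bit hash, re-keyed by
        # `flavor` so each restart walks a fresh pseudo-random graph.
        def g(x: int) -> int:
            return full_hash(b"%d:%d" % (flavor, x), n) >> tau

        x0 = flavor
        tortoise, hare = g(x0), g(g(x0))
        while tortoise != hare:                    # Floyd phase 1: detect the cycle
            tortoise, hare = g(tortoise), g(g(hare))
        tortoise = x0
        while g(tortoise) != g(hare):              # phase 2: stop just before the merge point
            tortoise, hare = g(tortoise), g(hare)
        m1 = b"%d:%d" % (flavor, tortoise)
        m2 = b"%d:%d" % (flavor, hare)
        if m1 != m2 and bin(full_hash(m1, n) ^ full_hash(m2, n)).count("1") <= w:
            return m1, m2                          # the truncated bits are light enough
        flavor += 1                                # otherwise restart with a new flavor

# Toy parameters; the talk's setting would be something like n=128, w=10.
print(truncated_near_collision(n=32, w=4, tau=12))
```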
We truncate, then we find collisions, and if there is a collision in the truncated version, then when we look at the full outputs they will be somewhat close, because all the non-truncated bits are equal. When we look at it this way, the natural thing to do is to look at covering codes, because this kind of notion is very natural in the setting of error-correcting codes, and what we are looking for is exactly a covering code. If you take the decoding function f of such a code, you just look for collisions in f(h(·)), and a collision in f(h(·)) implies a near-collision for h. In the direct setting, if you have a covering code with radius R, then both outputs are within distance R of the same codeword, so you get a near-collision with distance at most 2R. This approach usually gives better results: with a covering code you can get a better complexity than with plain truncation.

So let's go back to the outline of the talk. We now have the lower bound and the best complexity for memoryless algorithms, and our goal is to build something in between. The way we are going to do that is with time-memory trade-offs. Because if you were actually going to implement this kind of attack, you would not use the memory-full algorithm: it uses just too much memory and makes too many memory accesses, so it's not going to work. But you don't have to go all the way to memoryless either. When you do this kind of computation, you usually have some memory available: if you use a cluster, you have quite a lot of memory available; if you use a GPU, you also have a fair amount of memory. Whatever machine you are going to use, you probably have some memory, so what we try to do is use this memory to improve the complexity.

So let's look at the time-memory trade-off, going back to the truncation algorithm. To find near-collisions using truncation, we choose some parameter tau, truncate tau bits, and look for collisions in the truncated hash function. If we look at how things evolve when we change tau: the more we truncate, the more collisions we are going to need before the truncated part has a small enough distance, but each collision becomes cheaper, because there are fewer remaining bits on which we are trying to collide. So when you truncate many bits, you need a lot of collisions, and the question is how expensive it is to find a lot of collisions. In a memoryless setting, finding i collisions costs i · 2^((n-tau)/2): you just repeat the collision attack i times. That's if you don't use any memory. But if you have some memory available, then once you have found the first collision you can keep some state that will help you find the second collision faster than the first one, and so on. If we can do something like this, the overall algorithm becomes more efficient: the idea is to pick a larger tau, so that we need many collisions, but we get each of them more cheaply because we use some memory, and we can get below the memoryless figure here. So how does this actually work?
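Before getting into the details of the trade-off, here is a minimal illustration of the covering-code idea described above, using the Hamming(7,4) code (covering radius 1) applied block-wise; the choice of code, the block structure, and the brute-force collision search are illustrative assumptions, not necessarily what the talk uses.

```python
# Covering-code approach: decode the hash to the nearest codeword and look
# for collisions on the codeword. Colliding codewords imply each 7-bit block
# of the two hashes is within distance 2, i.e. a 2R-near-collision overall.
import hashlib

BLOCKS = 4                 # 4 blocks of 7 bits => n = 28, covering radius R = 4
N_BITS = 7 * BLOCKS

def h(msg: bytes) -> int:
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") >> (256 - N_BITS)

def hamming7_decode(block: int) -> int:
    """Decode a 7-bit word to the nearest Hamming(7,4) codeword (radius 1)."""
    syndrome = 0
    for pos in range(1, 8):                # columns of H are the numbers 1..7
        if (block >> (pos - 1)) & 1:
            syndrome ^= pos
    return block ^ (1 << (syndrome - 1)) if syndrome else block

def f(x: int) -> int:
    """Block-wise decoding: maps any N_BITS word to the nearest codeword."""
    out = 0
    for b in range(BLOCKS):
        out |= hamming7_decode((x >> (7 * b)) & 0x7F) << (7 * b)
    return out

# Collision search on f(h(.)); a table lookup is enough at this toy size,
# a memoryless rho search would be used for realistic sizes.
seen = {}
for k in range(1 << 20):
    m = b"msg-%d" % k
    c = f(h(m))
    if c in seen:
        other = seen[c]
        dist = bin(h(m) ^ h(other)).count("1")
        print(m, other, "distance of full hashes:", dist)   # <= 2 * BLOCKS
        break
    seen[c] = m
```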
Well, there is a very nice paper by van Oorschot and Wiener about parallel collision search, and basically what they do is use an approach based on distinguished points. We say that a point is distinguished if it has some number of trailing zeros. Then you build chains of iterations: you iterate, you iterate, you iterate, and when you reach a distinguished point you stop, you store the starting value and the distinguished endpoint, and you repeat this. At some point you will reach a distinguished endpoint that you have already seen before, which means that two chains are colliding; then you restart the two chains from their starting points to locate the actual collision. If you look at the complexity analysis, the cost of finding i collisions this way is less than i times the cost of finding one collision. More precisely, if you look for a relatively small number of collisions, meaning you have more memory than the number of collisions you want, then you get a speed-up of sqrt(i), which is optimal. On the other hand, if you don't have enough memory, if you want more collisions than the size M of your memory, then the speed-up is about sqrt(M). To combine those into a full expression that is valid everywhere, what we did in our paper is simply sum those two cost expressions. When i is much smaller or much larger than M, one of the two terms becomes negligible, so we get the right value, and when i is around M, we did some experiments and the summed expression is still relatively accurate. So we have a good evaluation of the complexity of finding i collisions.

Now, when we plug this into the simple truncation-based algorithm, we can compute the complexity of this algorithm using the time-memory trade-off. If we do the analysis, we see that for small values of tau the complexity is decreasing, but for large values of tau it is increasing, so the minimum is somewhere in the middle, and the middle is when i is about the same as M. There, the complexity is about 2^(n/2) / sqrt(B_w(tau)). That's the complexity of the truncation-based algorithm when we use the time-memory trade-off, and it's much more efficient than in the memoryless setting.

To see a little bit how this works, here are some examples. I'm looking at a near-collision with w = 10 in a 128-bit hash function; that could be MD5, or some other hash truncated to 128 bits, whatever you want. If you look first at previous work: there is the lower bound I explained in the beginning, which in this case is about 2^40; the best known algorithm uses covering codes and has a complexity of 2^52.5, and we can also show that any covering-code-based algorithm has a complexity of at least 2^50. Previous truncation techniques give a complexity of about 2^54 or 2^53, so they sit above the covering-code approach. Now, when we use the time-memory trade-off with a fairly modest amount of memory, quite small these days, maybe you even have that kind of memory in your cell phone, we can go down to a complexity of 2^47. So it's much better than previous algorithms, and we get relatively close to the lower bound, much closer than before. So now let's move to the second contribution of this paper; the first one was the time-memory trade-off.
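Here is a minimal sketch of the distinguished-point collision search described above, in the spirit of van Oorschot and Wiener's parallel collision search; the step function (SHA-256 truncated to n bits), the distinguished-point rule, and all parameters are illustrative assumptions, not the talk's exact setup. Keeping the table of stored chains across collisions is what makes each additional collision cheaper than the first.

```python
# Multi-collision search with distinguished points; stored chains are reused.
import hashlib

def make_step(n_bits: int):
    def step(x: int) -> int:
        digest = hashlib.sha256(x.to_bytes(8, "big")).digest()
        return int.from_bytes(digest, "big") >> (256 - n_bits)
    return step

def collisions_dp(n_bits: int, dp_bits: int, num_collisions: int):
    """Return pairs (x, y), x != y, with step(x) == step(y)."""
    step = make_step(n_bits)
    is_dp = lambda v: v & ((1 << dp_bits) - 1) == 0     # dp_bits trailing zero bits
    table = {}                                          # distinguished point -> chain start
    found, start, cap = [], 0, 64 << dp_bits            # cap avoids rare DP-free cycles
    while len(found) < num_collisions:
        x0, start = start, start + 1
        x, steps = x0, 0
        while not is_dp(x) and steps < cap:             # walk until a distinguished point
            x, steps = step(x), steps + 1
        if not is_dp(x):
            continue
        if x in table:                                  # two chains share an endpoint
            a, b = table[x], x0
            succ_of = {}                                # successor -> predecessor on chain a
            while not is_dp(a):
                succ_of[step(a)] = a
                a = step(a)
            while not is_dp(b):                         # re-walk chain b to the merge point
                nb = step(b)
                if nb in succ_of and succ_of[nb] != b:
                    found.append((succ_of[nb], b))      # same successor, different inputs
                    break
                b = nb
        else:
            table[x] = x0                               # store the chain for later reuse
    return found

print(collisions_dp(n_bits=24, dp_bits=6, num_collisions=3))
```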
The other thing we do is to combine the truncation approach and the covering-code approach. How can we combine those two techniques? It's actually very simple: first we truncate the hash function, and then, instead of looking for collisions on the remaining bits, we look for near-collisions on the remaining bits. It may seem a little bit strange: I'm saying that to find near-collisions, you truncate and then you find near-collisions, so how do you find those near-collisions? Of course you are not going to truncate again, that would not make sense; what you do is use a covering code. So first you truncate, and then you use a covering code to find near-collisions on the remaining bits.

How does that work? You have to choose some parameters: tau, how many bits you truncate, and r, the radius of the covering code, which controls how many active bits can come from this part. When you look at the full near-collision at the end, you have active bits both in this part here, due to the covering code, and in this part here, due to the truncation. So you have a relatively large parameter space: you can choose both r and tau. And if you look at special cases, it actually covers all previous algorithms: if you choose tau equal to 0, you are not truncating anything, so you just have a covering code, as proposed before; on the other hand, if you choose r equal to 0, you are not using a covering code, you are just using truncation. So this algorithm is more general than all the previous ones, and it will be at least as good as them.

When we look at the complexity of this algorithm, the bad thing is that I don't know how to give you a nice closed expression for the optimal tau or the optimal r; I don't have the kind of analysis I did earlier for the pure truncation case. However, the good thing is that it's relatively easy to compute the complexity for a fixed value of tau and a fixed value of r. You can look at the paper for the details; there is actually a piece of code that will just do the computation for you. So what you do is try all possible values of tau and all possible values of r, and just pick the best one. Depending on the amount of memory you have, the output size n of your hash function, and the target distance w, you compute this for all possible parameters and keep the best, and you get the complexity figures here. You can see that it's usually much better than previous algorithms, and you can get much closer to the lower bound. Something interesting is that usually both truncation and covering codes are involved: we are not in one of the degenerate cases, we are usually somewhere in the middle, and that gives better algorithms than using just truncation or just covering codes.

So that's almost the end of the talk; to summarize what we did here. The first contribution is a way to use time-memory trade-offs with the truncation approach: the basic idea is to use a larger tau, so truncate more, which means we need a lot of collisions, and then use the nice time-memory trade-off of van Oorschot and Wiener to find i collisions more efficiently than at a cost of i · 2^((n-tau)/2). That's the first result.
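To illustrate the "try all values of tau and r and keep the best" search just described, here is a rough sketch. The cost model inside it is a simplification I am assuming for illustration: the covering code is counted at the sphere-covering bound, its contribution to the distance is taken as the worst case 2r, and the cost of i collisions with memory M is (sqrt(i) + i/sqrt(M)) times the cost of one collision. The paper's own script computes a finer estimate; only the overall structure of the search is the same.

```python
# Exhaustive search over (tau, r) under a simplified cost model.
from math import comb, log2, sqrt

def ball(n: int, w: int) -> int:
    """B_w(n), with the convention ball(n, w) = 0 for w < 0."""
    return sum(comb(n, k) for k in range(w + 1)) if w >= 0 else 0

def log2_cost(n: int, w: int, log2_mem: float, tau: int, r: int) -> float:
    if ball(tau, w - 2 * r) == 0:
        return float("inf")                     # the code alone would exceed the budget
    collisions = 2.0 ** tau / ball(tau, w - 2 * r)      # repetitions needed (worst case)
    codewords = 2.0 ** (n - tau) / ball(n - tau, r)     # collision search space (sphere bound)
    batch = sqrt(collisions) + collisions / sqrt(2.0 ** log2_mem)
    return log2(batch * sqrt(codewords))

def best_parameters(n: int, w: int, log2_mem: float):
    best = (float("inf"), None, None)
    for tau in range(0, n):
        for r in range(0, w // 2 + 1):
            c = log2_cost(n, w, log2_mem, tau, r)
            if c < best[0]:
                best = (c, tau, r)
    return best                                  # (log2 complexity, tau, r)

# Example in the talk's setting, n = 128 and w = 10, with an assumed 2^40 memory.
print(best_parameters(128, 10, 40.0))
```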
The second result is that we showed how to combine the truncation approach and the covering-code approach in a very simple way: just truncate, and then use a covering code to find near-collisions on the remaining part. All of this leads to a relatively significant improvement in the complexity. For instance, if you look at a near-collision with w = 10 for MD5, which is actually an example from the paper that we implemented, the best previous complexity was 2^52.5, and now we can go down to 2^45.2, which is significantly better; we reduce the gap to the lower bound to roughly a factor of 2^5. So that's the end of my talk. Thanks for listening, and thanks for being up so early.

We have time for a few questions. Any questions? I have just one: since you compare without memory and with memory, and since you really do a precise analysis, shouldn't you take into account how much time it really takes to access the memory, depending on the type of memory? For example, you apparently applied this to MD5, so do you see the numbers changing a bit because you have to access, I guess, RAM? How much RAM did you use?

Yes. What we used for the MD5 example was 1 TB of RAM, but in this example it doesn't matter that much, because we don't access the memory very often, so the access time to the memory doesn't really come in. Another thing you could do is try to use disk memory; we didn't really look at this, but that could also be a way to improve things a little bit.

Okay, but the effect would be to favor the memoryless approach, I guess, in your example, if you take into account that it takes a few cycles to access the memory.

Yes, if you do count the time to access the memory, it would probably cost a little bit more, but I think for most of the parameter choices we have relatively few memory accesses, so it would be negligible.

Okay, if there are no more questions, let's thank the speaker again.