for helping me write this paper. Well, we all wrote this paper together, of course. But anyway, on to the talk itself. What we're going to be talking about is characterizing certain security properties for a class of algorithms. Up on the screen, you see two different Linicrypt programs. It turns out that one of these two programs is collision-resistant, whereas the other fails to even be second-preimage-resistant. How do we tell the difference between these two? That's what we're going to be talking about today.
So, what are Linicrypt programs? Up on the screen, we can see an example of one. Linicrypt programs are a class of algorithms, or a model of computation, introduced by Carmer and Rosulek in 2016. They lend themselves to powerful analysis through algebraic notions such as span and linear independence. But what are we actually able to do in a Linicrypt program? First, we can take inputs to our functions; these are inputs from a field. Traditionally, we can also sample randomly from the field, but here we consider those random samples to be included in the inputs to the function. Second, we can send field elements through a random oracle to get new field elements out. Often, to make the association between the program and the random oracle explicit, we write P^H. Third, we can take the field elements that we get either from inputs or from queries to the random oracle and combine them into new field elements using a fixed linear combination, and by fixed, I mean fixed with the program. Finally, we can return any of the intermediate values that we get throughout the execution of the program. These include not only the program's inputs and the random oracle's outputs, but also linear combinations of them.
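As a rough illustration of the four operations just listed, here is a minimal Python sketch of a Linicrypt-style program. Everything concrete in it is an assumption made for the example and does not come from the talk's slides: the prime modulus `P`, the lazily sampled `RandomOracle` class, the hypothetical `toy_program`, and the nonce argument on each query (nonces are only mentioned near the end of the talk).

```python
import random

P = 2**61 - 1  # assumed prime field modulus, chosen only for illustration


class RandomOracle:
    """Lazily sampled random function on field elements (a stand-in for H)."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._table = {}

    def query(self, nonce, x):
        # The same (nonce, input) pair always gets the same answer.
        key = (nonce, x % P)
        if key not in self._table:
            self._table[key] = self._rng.randrange(P)
        return self._table[key]


def toy_program(H, x, y):
    """Hypothetical Linicrypt-style program P^H(x, y)."""
    v1, v2 = x % P, y % P      # 1. inputs are field elements
    v3 = H.query(1, v1)        # 2. oracle query on a field element
    v4 = (v3 + 2 * v2) % P     # 3. fixed linear combination of earlier values
    v5 = H.query(2, v4)        #    another oracle query
    return (v5 + v1) % P       # 4. the output is again a linear combination


H = RandomOracle(seed=1)
print(toy_program(H, 3, 4))
```

The coefficients (here 1 and 2) are fixed with the program; only the inputs and the oracle's answers vary from run to run.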
So Linicrypt lends itself to multiple views that let us reason about these programs. We have the algorithmic view, which is just the sequence of operations that one would take to run a program. This may be the more natural way of thinking about it. We also have a graphical view in DAG form. The reason that we don't have cycles is the calls to the random oracle. And finally, we have the algebraic view, which is the more important view for our paper and for reasoning about these properties.
So previously, in Carmer and Rosulek's work in 2016, they looked at when two Linicrypt programs run on random inputs are indistinguishable from each other in terms of their output. But what about collision resistance or second-preimage resistance for these types of algorithms? Well, as it turns out, both of these properties, not only collision resistance, can be characterized in terms of linear-algebraic properties such as span and linear independence, as I said earlier. And although our paper produces concrete bounds that differ between second preimages and collisions, as we'd expect, asymptotically the two are equivalent for these programs.
So before we get into the nitty-gritty, what do we mean by second preimages or collisions in Linicrypt? As it turns out, it's exactly what we'd expect from the natural definitions. Basically, the adversary is given some output of a Linicrypt program and the preimage, the set of inputs that generates it (or, for collisions, maybe not), and gives us a different set of inputs, a different preimage, that produces the same output when we run it through the program. So let's see what finding a collision or a second preimage looks like inside a Linicrypt program. Before we run through this, I'll just say that we have a special recipe for cooking up collisions and second preimages, which we're going to follow in this example.
So the first thing we're going to do, of course, is show the two executions. On one side we have the algorithmic run, and on the other side we have the graphical run, with its set of internal variables. Since this is a collision or second preimage, our outputs need to be the same in both runs. Because of this, V5, one of our outputs, needs to be the same in both runs. And due to the oracle constraint imposed by the query on V3, which gives us V5, since V5 is now fixed, V3 is now fixed. Because of that, V3, one of our input vectors, will be the same in both runs.
Now, following our special recipe, we choose V6 arbitrarily but different between the two runs. Then V7 needs to be different between the two runs, since the oracle constraint imposed by querying the oracle on V6 to get V7 means that the answers will be different. Then, since we know that V8, one of our output vectors, is the same in both runs, and we know what V7 is, V1 is uniquely determined by the system, and we can solve backwards using linear algebra to determine V1 in both runs. As it turns out, V1 will actually be different between the two executions because of how we chose V6. Now we've found one of our input vectors that differs between the runs. Finally, we can use the information that we have on V1, V3, and V6, and some linear algebra, to solve backwards for V2 in both runs.
What we have at this point is, by construction, different inputs to our Linicrypt program that produce the same outputs: a collision. And the way that we've done this, following our special recipe, allows us to find second preimages in exactly the same way. So this is the general idea of what we did, and a kind of outline of the recipe that we followed.
We identified some set of oracle queries that were the same in both runs. We also identified some oracle query that was different between the runs; it's what we call our special oracle query. Then, using the information that we had, we solved backwards for all of the internal queries. And, incidentally, we also solved backwards for all of our input vectors, which, like our oracle queries, can determine the entire system.
Now, the algorithmic view or the graphical view may be more natural to look through. It's powerful, but it doesn't lend itself to algebraic analysis, so we need to start easing ourselves into the algebraic representation of these Linicrypt programs. So... Oh, it looks like that fell off. Okay. As we can see, we have the base variables highlighted. What do I mean by base variables? Base variables are any variables in the run of a Linicrypt program that either come in as input or come out of an oracle query. All the other values inside a Linicrypt program are just linear combinations of these variables, so they're the ones that we want to keep track of. And, to separate the outputs of the program from the rest of the system, we keep all of the outputs inside a new matrix called M.
But there's something missing in this representation. There's internal structure that we don't capture through this simple algebraic representation, namely the relationships imposed by the oracle queries. That's why we have the full algebraic representation, which you can see on the right side of the screen, with C and M. M is the same in both representations; it just holds the outputs of our program. C, however, is a new matrix which represents not only the base vectors of our system that are important to us, but also the relationships imposed by oracle queries in our program.
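To make the pair (M, C) a little more concrete, here is a small Python sketch under stated assumptions. The program it encodes is hypothetical (it is not the one on the slides), the modulus `P` is an arbitrary small prime, and the encoding of each row of C as a `(nonce, query row, answer row)` triple is just one convenient way to write down "this linear combination is queried and that base variable is the answer".

```python
import random

P = 101  # small prime modulus, assumed for readability

# Hypothetical program with base variables (v1, v2, v3, v5):
#   v1, v2 are inputs; v3 = H(1, v1); v5 = H(2, 2*v2 + v3); output = v1 + v5.
M = [[1, 0, 0, 1]]                      # one row per program output
C = [
    (1, [1, 0, 0, 0], [0, 0, 1, 0]),    # v3 is the oracle's answer on v1
    (2, [0, 2, 1, 0], [0, 0, 0, 1]),    # v5 is the oracle's answer on 2*v2 + v3
]


def dot(row, values):
    """Evaluate one linear combination of the base variables mod P."""
    return sum(r * v for r, v in zip(row, values)) % P


def make_oracle(seed=0):
    """Lazily sampled random function, standing in for the random oracle."""
    rng, table = random.Random(seed), {}

    def oracle(nonce, x):
        if (nonce, x) not in table:
            table[(nonce, x)] = rng.randrange(P)
        return table[(nonce, x)]

    return oracle


def run(M, C, inputs, oracle):
    """Fill in the base variables query by query, then read outputs off M."""
    base = list(inputs) + [0] * (len(M[0]) - len(inputs))
    for nonce, q, a in C:
        # Answer rows are unit vectors in this sketch (an assumption).
        base[a.index(1)] = oracle(nonce, dot(q, base))
    return [dot(row, base) for row in M], base


outputs, base = run(M, C, [3, 4], make_oracle(seed=1))
print(outputs, base)
```

Every value the program ever computes is some row vector applied to `base`, which is why tracking only the base variables is enough.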
It turns out that this representation is enough to uniquely construct a Linicrypt program. So, as I was saying, C represents the relationships between oracle queries. As we can see here, the first line of C shows us the relationship between v1 and v4.
Now, what if we were given a collision, two inputs that generate the same output? Well, we can run the two side by side and see what happens during the run. The first thing we can do is identify base variables shared between both runs. This can be done just by running through the programs themselves. Now, this isn't necessarily going to happen; there may be no base variables shared, but if there are, we can identify them now. Next, we can identify a base variable that is different. We're guaranteed to have a base variable in the system that's different, because the inputs are different but the outputs are the same. It turns out that for this base variable right here, the input of the corresponding query must be linearly independent of the queries that we've already fixed; otherwise, the output would have been determined to be the same in both runs. Finally, we can solve backwards using linear algebra until all of our queries are defined.
But what might go wrong here? What happens if we hit this type of constraint, where we expect v7 or v6 not to be fixed, but it is? How do we get around this? Maybe if we had an order in which the oracle queries were set up, we could avoid this problem. This is where that special recipe comes in. A collision structure is, first, an ordering of C, that is, an ordering of the queries of the program, and second, a special query I, the special query that we start the chain reaction off of.
As we said earlier, the inputs to I cannot be in the span of the queries before it or of the outputs. Otherwise, it would have been marked green, that is, marked as the same in both runs, and we'd have to choose a different special query. Finally, to make sure that we don't get stuck at any point of the run, we impose this constraint on the collision structure. Now, this is the general form of the recipe that we saw at the beginning of the talk. But what if we made a small change? Well, it turns out that our collision structures are actually the special sauce that we were using to cook up these collisions or second preimages.
So really fast, here's a run. Without loss of generality, we can say that the attacker makes all of the queries in both the original run and the second-preimage run, so we can set those up in parallel. Which queries are the same in both runs? Well, all of our outputs are going to be the same in both runs. We can immediately mark those as green, along with everything that leads directly to those outputs, and then work backwards from our green. So what is our special query here? We just follow the run of the adversary: the first non-green query, the first query that is not the same in both runs, is our special query. Finally, we can use all the information we have to work backwards to determine all of our inputs and find our collision structure.
So here's the conclusion. As it turns out, the existence of a collision structure in a program, the lack of collision resistance, and the lack of second-preimage resistance are all equivalent, modulo degeneracy, of course, that is, degenerate programs. So collisions and second preimages in Linicrypt programs can be boiled down to algebraic terms such as span and linear independence. These properties of span and linear independence lend themselves to polynomial-time algorithms.
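The span conditions mentioned here reduce to rank computations over the field, which is what makes them checkable in polynomial time. The following Python sketch shows only that building block plus the first condition as it is stated in the talk (the special query's input is not in the span of what came before it together with the outputs). It is an illustration under assumptions, not the paper's algorithm: the modulus `P`, the `(nonce, query row, answer row)` encoding of C, the reading of "queries before it" as both the query and answer rows of earlier constraints, and the example program are all hypothetical. The second, "don't get stuck" constraint is not spelled out in the talk and is not implemented.

```python
P = 101  # small prime modulus, assumed for illustration


def rank_mod_p(rows, p=P):
    """Rank of a list of row vectors over GF(p), by Gaussian elimination."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)          # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(x - f * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def in_span(v, rows, p=P):
    """True if v is a linear combination of rows over GF(p)."""
    return rank_mod_p(rows + [v], p) == rank_mod_p(rows, p)


def special_query_is_free(M, ordered_C, i):
    """First condition only: the input of query i is outside the span of the
    earlier query/answer rows and the output rows (assumed reading)."""
    earlier = []
    for _, q, a in ordered_C[:i]:
        earlier += [q, a]
    return not in_span(ordered_C[i][1], earlier + M)


# Hypothetical program over base variables (v1, v2, v3, v5).
M = [[1, 0, 0, 1]]
C = [(1, [1, 0, 0, 0], [0, 0, 1, 0]), (2, [0, 2, 1, 0], [0, 0, 0, 1])]
print(special_query_is_free(M, C, 0))
```

Passing this check alone does not mean a collision structure exists; it only says that the chosen query could serve as a starting point, with the remaining constraint still to be verified.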
Actually, in our paper, we present a polynomial-time algorithm for finding collision structures in arbitrary Linicrypt programs, and now that we've shown that collision resistance and second-preimage resistance are equivalent, we can simplify our security parameters for these types of functions. One thing I skipped over a little is that all of our functions, sorry, the class of Linicrypt programs we're looking at, are ones with nonces given to each of the oracle queries. This is actually really important. On the top we have a collision-resistant program, and on the bottom we have one that fails to be collision-resistant. The only difference is that one has nonces and one doesn't. The reason is that, without nonces, the attacker can intelligently choose inputs of the form y equals a call of the oracle on x, which effectively expands the program's output, and it turns out that determining these properties for these types of functions is actually NP-complete. Finally, this was in the random oracle model; however, it should be easy to extend it to the ideal cipher model. Thank you.
We have time for questions.
So can you say how you deal with degenerate programs?
Oh yes, of course. One of the things I said earlier in the talk was that, since we have different inputs and the same output, there must be some internal state that differs. Degenerate programs are a special class of programs that break that rule. Basically, these programs are so simply terrible that they fall outside the scope of the collision-structure result. Here's a good example. We have an oracle query on a linear combination of X and Y, and as it turns out this gives us an entire space of colliding inputs: basically anything that, and I'm going to put heavy quotation marks around this, sums to the same input to this oracle query.
Are there any other questions? Okay, if not, then let's thank the speaker again.
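The degenerate example from the last answer can be sketched in a few lines of Python. This is an illustration under assumptions: the modulus `P`, the lazily sampled oracle `H`, and the specific combination x + y are chosen for the example, since the talk only says that the query is made on a linear combination of X and Y.

```python
import random

P = 101  # small prime modulus, assumed for illustration

_rng, _table = random.Random(0), {}


def H(x):
    """Lazily sampled random function, standing in for the random oracle."""
    x %= P
    if x not in _table:
        _table[x] = _rng.randrange(P)
    return _table[x]


def degenerate(x, y):
    """Hypothetical degenerate program: a single oracle query on x + y."""
    return H(x + y)


def second_preimage(x, y, shift=1):
    """Slide along the line x + y = const; the oracle's input never changes,
    so no internal state differs between the two runs."""
    return (x + shift) % P, (y - shift) % P


x2, y2 = second_preimage(3, 4)
print((x2, y2), degenerate(x2, y2) == degenerate(3, 4))
```

Both runs make exactly the same oracle query, which is why such programs sit outside the collision-structure argument: there is no query that differs between the runs to act as the special query.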