Welcome back. One second before we start the talk, because I want all of you to be able to enjoy the practical session, which I spent the whole of yesterday setting up. I think it's quite a lot of fun, and you might have quite a bit of fun with it too. I'd like you to try to install Docker, which you can just search for online. If you use Windows, search for "Docker Windows install" or something like that; if you use Linux, it's sudo apt-get install docker; if it's macOS, search for "Docker install macOS" and you'll find the instructions. It's usually a point-and-click exercise. Once you have installed it, go into a console: on Windows that's the Command Prompt, on Mac it's called a console, and on Linux it's the terminal. Then type in the command docker pull msoos/cryptominisat:winterschool and it will download the Docker image. It's about 50 megabytes, so it really shouldn't be too difficult to download. The other option you have is to build it from scratch from GitHub. I've done that as well; with someone here we did it in less than five minutes, I'd say, so it's not a big deal. If you get stuck, just talk to me during the next break, and I can sit down with you for five minutes and make sure we get the system built so you can use it. Someone asked: should we do docker run instead? You can do pull or run, it doesn't matter. Run will just fail at the end, but it will pull the image first; pull just pulls it. Either way the image gets pulled, and we'll eventually have to run a different command anyway. Pull is nicer because you don't see a failure at the end, but it's all good either way: it will pull the image down from Docker.
So it needs to be pulled down, and it's about 50 megabytes. I don't want everybody to hit the same server at the same moment when we start the exercise. The server wouldn't be the problem; I think the internet bandwidth might be an issue here. Not because of India, but because hundreds of people trying to download 50 megabytes each is something like five gigabytes, and no internet connection will get that through a Wi-Fi quickly enough. So I suggest you do it now. It does work: I know it worked with Daniel, and the build from source worked with one of our students up here, so it shouldn't be a big deal. If it doesn't work, I'll just sit down next to you and we'll do it. Just don't forget to try, so that we have a chance of playing around a little, because I think it's going to be tons of fun. Someone reported that their run did download the image and didn't complain. OK, the run does fail at the very end, but it did download the image, so that's fine. Another question: is there something you should do after the pull to make sure it has really been downloaded? Just do the pull and it should say that all the layers have been downloaded; if you run the pull again, it should say the image is up to date, and then everything is fine. So the pull actually does work. Sorry about the run confusion, I'm new to Docker. I had to learn it yesterday, so bear with me. I mean, I have a Docker image, but it's kind of new to me. Docker is a pretty complicated system in the end, but it does work, and pull is the right thing; I just learned that one of the students is clearly better at using Docker than I am. Docker is also something you might want to play around with anyway, because it's used in all large enterprises.
Some companies will say "we deploy software 100 times a day", and this is what they mean: they deploy Docker images, and that's why they can do it 100 times a day, because Docker is so efficient at this. OK, so let's go back to where we left off. We left off with proofs: I said that SAT solvers are, in effect, proof systems, and this is what I want to pick up now. What do we mean by proof systems? What is a proof in this case? Of course, if the problem is unsatisfiable, then the UNSAT proof is super important. That's the thing we really want, because we need to prove unsatisfiability, and the only way to do that is to build this proof. And the proof system that you use determines the minimum size of the proof that you can build. There are some proof systems, for example the resolution proof system, that for particular problems, for example the pigeonhole principle, can only produce an exponential-size proof, which is to say the solver will run in exponential time when you're lucky. Because actually, many SAT solvers will never terminate on these: even though the worst case is exponential, the heuristics get messed up, and the solver keeps deleting its partial proofs until it ends up recreating the proof it just deleted, then deleting it again, effectively going into an infinite loop. So it's exponential when you're lucky. The proof system you use really matters. We'll get back to this a little later with CDCL(T), where you can expand the kinds of theories the solver understands, and you might be able to prove, for example, the pigeonhole principle in a linear number of steps, which is kind of surprising. Or, actually, I don't think it's linear.
I think it's polynomial, with a very low exponent. And if the minimal proof size is exponential, you're in a mess, because depending on the exponent it might never work out: if you have 100 pigeons, it will never finish, even though a five-year-old can solve the pigeonhole problem. I'll show you in a second why. And if the problem is satisfiable, most people think: OK, it's satisfiable, so if I'm lucky I just branch to the right place and I'm done. But suppose the solution space is exponential in size, which of course it is, and your proof system takes an exponential number of steps to prove that there is no solution in some parts of the search space. Then there's a very, very high chance that you will never actually bump into the satisfying assignment. If you could create something like a pigeonhole principle with a single solution, what you would need to do is prove that there's nothing on the other side and get directed into the solution that way, or be super lucky and go right into it. SAT solvers try to do something in between: they try to be lucky while at the same time building the proof. But if proving that there's nothing in some region takes you exponential time, then most likely you'll never get to the solution. One of the really confusing things about SAT solvers on satisfiable instances is that you might think you'll just get lucky. But imagine trying to get lucky with probability one over two to the power of 2,000. That would be very lucky. If I were that lucky, I'd be playing the lotto every day. So no, you're not going to get lucky. You really do need the proof system.
And this is the strange thing about proof systems even when your problem is satisfiable, for example a cryptographic problem. If I create a cryptographic problem with an input, an output, and a key, there is of course a single solution: you can find the key, you just need to be lucky. The problem is that these systems are designed so that you really do need 2 to the power of 128 or 2 to the power of 256 steps, well, on average half that, so 2 to the power of 127 steps, to find the solution. And that's a lot of steps. The number of atoms in the universe is, I think, below 2 to the power of 300. You might want to consider that when you're thinking about solution steps and how lucky you can get. Even if every atom played the lotto, it would still not work, so that might not be the right strategy. And this is actually quite easy to validate. Say you create a set of XORs and give them to Gauss-Jordan elimination. It's one of those elementary things people usually do in a computer science course, or even in high school: you do the elimination to get a specific form of the matrix, called row echelon form, and then you read out the solution. I can teach it to you, or you could teach it to elementary school children; somebody who's maybe 13 or 14 could do it in a few hours. There's a specific single solution. And yet if you give the same system to a SAT solver, it will never terminate. You could say: well, it could just get lucky. Yes, of course it could get lucky, but how many times do you have to run it to get lucky, and for how long?
And of course if your matrix is very large, you're back to the point of: OK, there are more atoms in the universe than my chance of getting this right, so I'm done, this will never work. So generating the proof is really important. It's really important: how you generate it, what you generate, how efficient the proof generation is, et cetera. So now I'm going to talk a little bit about proof generation. Here's an example proof. This problem actually doesn't have a solution: if you look at it, I'm saying A XOR B XOR Z equals 0, and A XOR B XOR Z equals 1. Of course, it's impossible that the same XOR is both 0 and 1 at the same time, so this is not going to work. And here's one way to prove that there is no solution. Let's try to derive Z. We resolve these two clauses here: observe that Z is the same in both, B is inverted, and not-A is the same, so we derive (not-A or Z). Then we take (A or not-B or Z) and (A or B or Z); these clash on B again, and we get (A or Z). Now we resolve (not-A or Z) with (A or Z), and we get Z. But observe that we could have used other clauses here: we could have paired the top clauses the other way, derived (not-B or Z) and (B or Z), and then derived Z from those. So there are actually different ways of deriving Z. And of course you can derive not-Z in the same way from the other four clauses, and Z together with not-Z is a contradiction. And here we go, this is exactly it: this is your little proof right here. You take the two clauses on the left to get (not-A or Z), here you have (A or Z), then you derive Z; you do the same thing on the other side to derive not-Z, and then you resolve those two to derive the empty clause. So this is a proof. The question is: how many different ways are there to derive this empty clause?
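To make the derivation concrete, here is a small sketch in Python. The encoding is my own (clauses as sets of signed integers, positive for a variable and negative for its negation); it is not taken from the slides, but it follows exactly the resolution steps just described: two resolutions on B, one on A, the mirror image for not-Z, and a final resolution producing the empty clause.

```python
def resolve(c1, c2, pivot):
    # Resolution rule: c1 must contain `pivot`, c2 must contain `-pivot`;
    # the resolvent is everything else from both clauses.
    assert pivot in c1 and -pivot in c2
    return frozenset((c1 - {pivot}) | (c2 - {-pivot}))

A, B, Z = 1, 2, 3  # positive int = variable, negative int = its negation

# CNF of the two contradictory XOR constraints:
# A xor B xor Z = 0 forbids all odd-parity assignments,
# A xor B xor Z = 1 forbids all even-parity assignments.
xor0 = [frozenset(c) for c in ({-A, B, Z}, {A, -B, Z}, {A, B, -Z}, {-A, -B, -Z})]
xor1 = [frozenset(c) for c in ({A, B, Z}, {-A, -B, Z}, {-A, B, -Z}, {A, -B, -Z})]

# Derive Z: two resolutions on B, then one on A.
na_z = resolve(xor0[0], xor1[1], B)   # (not-A or Z)
a_z  = resolve(xor1[0], xor0[1], B)   # (A or Z)
z    = resolve(a_z, na_z, A)          # (Z)

# Derive not-Z the same way from the remaining four clauses.
a_nz  = resolve(xor0[2], xor1[3], B)  # (A or not-Z)
na_nz = resolve(xor1[2], xor0[3], B)  # (not-A or not-Z)
nz    = resolve(a_nz, na_nz, A)       # (not-Z)

empty = resolve(z, nz, Z)             # the empty clause: UNSAT proven
```

Swapping which clause pairs you resolve first gives the alternative derivation through (not-B or Z) and (B or Z), which is exactly the "four different ways" point made next.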
So there are actually a number of different proofs you can build here, a number of different ways to derive the empty clause. That's quite true in general, and sometimes the different proofs are very different in size. Here they're all the same size: you just rearrange the top clauses, derive (not-B or Z) and (B or Z) instead, and then derive Z again, and the same on the other side. Each of the two intermediate clauses can be derived in two ways, so there are four different ways of building this proof. And this is something super simple. I basically couldn't make anything smaller than this, and I already have four different ways of doing it. Imagine a larger problem with millions of variables: how many different ways there must be. Now, the set of input clauses that you need to derive the proof is called the core: the set of clauses that all play a part in establishing the unsatisfiability. Sometimes there's more than one core, and sometimes there are entirely different cores, cores that don't intersect at all. So there can be two disjoint sets of clauses such that, if you take either of them and put it into a SAT solver, or any resolution proof system, it will derive the unsatisfiability, the empty clause. And if one SAT solver starts building a proof that's much larger than another's, the other solver will finish first because it's deriving the smaller one. Sometimes it's by chance; sometimes we try to guess which clauses will get us closer to a smaller proof. But sometimes, as here, the size of the proof is the same no matter how you derive it, and you're stuck, because that's the only way to do it. So, some observations about this graph.
In general there are many different proofs, right? I just showed you one with four different derivations; in general there are many more. A proof is a directed acyclic graph. Notice there is no cycle here, and it's directed: this resolvent depends on those two clauses, that one depends on these four, et cetera. There are arrows here, not just lines. And yes, different proofs can be very different in size, so you can sometimes derive the same result much faster. Also notice that here every clause is used only once: this clause is used once, this one once, and so on. But that's not the case in general. A clause can be used again later in another part of the proof, so the proof doesn't need to be a tree. This one happens to be a tree, but in general it just needs to be a DAG, a directed acyclic graph. You can turn it into a tree, there is this notion of tree-like proofs, but it blows up the size by a polynomial factor; it doesn't make it exponential, so it doesn't make a big difference. The input set of clauses used by the proof is called the core of the CNF: those yellow clauses you saw at the top are the core. There are sometimes different cores, and sometimes smaller cores. There's even the notion of a minimal unsatisfiable core, and also of a minimum unsatisfiable core, when you want the smallest possible set of clauses that causes the unsatisfiability. OK, so why is this interesting? Why would I care about the minimal set of clauses that causes the unsatisfiability? Let's step back a little and think of industry.
Okay, so you're in this meeting scheduling a football match, and they give you all these constraints: hey, I want these things, now schedule my football match. And your SAT solver tells you: no, with these constraints, it cannot be done. So what do you do? You can't just tell them "something's wrong with your constraints"; that's not going to help, and it would be very weird to say to someone. What you want to tell them is: this particular set of constraints together definitely cannot work, so you have to relax one of them. And typically it's not every single constraint they gave you. One of the constraints might be, say, that the match needs to happen between January 1st and December 31st; that's most likely not the constraint that failed. So you want the minimal core of the unsatisfiability, which you can hand to these people and say: one of these things needs to be changed, because if you don't change any of them, it will definitely stay unsatisfiable. And this could be just three of their constraints. Say one of them was "it needs to be on a Saturday or Sunday", and another said "not on a weekend". Okay, well, clearly that's not going to work: it needs to be either Saturday or Sunday but not on a weekend. That doesn't make sense. And put that way, it makes a lot of sense to the people you're talking to why their constraints failed. Whereas if you take all the thousand constraints they gave you and say "you tell me which one I need to relax"...
Well, they're going to throw you out of the room. So the core is actually a really important concept. I was once in a meeting where somebody gave a presentation about proving cryptographic protocols correct. And their point was: no, no, I'm not using SAT solvers, I do my own decision procedure, and my own decision procedure is great. So I said: okay, let's see. What you prove when you get UNSAT at the end is that it is not possible for the attacker to get the key. You have this massive protocol, and at the end you prove the attacker cannot get the key. So why do you need the whole massive protocol? Can I take this part out, can I take this constraint out, and will the attacker still be unable to get the key? And the answer was: oh, yeah, I'd run the whole thing again. Why would you do that? Use a SAT solver for the whole thing: do the decision procedure and the proof building through the SAT solver, build the unsatisfiability proof, and look at which parts were in it. If some part of the protocol was not part of your unsatisfiability proof, you can take it out. All that sending around of data and computation goes in the trash bin; you don't need it, and the proof still stands: the attacker cannot get the key. This is one of those places where building your own search system sounds like a great idea, a massively beautiful idea, because who doesn't know how to do branching left and right?
I mean, you tell this to a 16-year-old and I'm sure they can write it. I used to write this kind of thing as a 14-year-old, in Pascal, back in the day; anybody can do it. But if you use the full machinery, you get all these extra bells and whistles, like: here is the reason why the attacker cannot get the key. And then you realize: oh, okay, I don't need this part of the protocol because it's not part of the proof; if I throw it out, the proof still stands and the attacker still cannot get the key. And the protocol can be massive, tens of thousands of steps. You don't want to go "let me remove this one, is it still okay? let me remove this one" and rerun the thing 10,000 times. Why would you do that? You run it once, you build the proof, you look at the core, you remove everything that's not needed, and you're done. So the proof is actually useful, and the core is really useful. Okay, that was me ranting, but I think it's an interesting demonstration that using these proof systems, SAT solvers, and all the tooling that comes with them is going to be useful in the long run: you might be able to build your own search system, but you probably don't want to build your own proof system, because it's going to be painful. Okay, so yes, there are sometimes many different cores, and sometimes one core is good enough, for example if you just want to prove something. But if you're in that scheduling meeting, you want to know all the different cores, all the cores that don't intersect one another, because you will have to relax all of those constraint sets in order to get a solution.
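As an illustration of the core idea, here is a toy Python sketch of my own (brute-force, nothing like how a real solver extracts cores): a deletion-based minimizer that drops each constraint in turn and keeps only the ones actually needed for unsatisfiability.

```python
from itertools import product

def is_sat(clauses, nvars):
    # Brute-force satisfiability check over all 2^nvars assignments.
    for bits in product([False, True], repeat=nvars):
        model = {v + 1: bits[v] for v in range(nvars)}
        if all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def minimal_core(clauses, nvars):
    # Deletion-based minimization: try dropping each clause in turn;
    # if the rest stays UNSAT, that clause was not needed.
    core = list(clauses)
    i = 0
    while i < len(core):
        trial = core[:i] + core[i + 1:]
        if not is_sat(trial, nvars):
            core = trial      # clause i was redundant, drop it
        else:
            i += 1            # clause i is needed, keep it
    return core

# Toy formula: UNSAT purely because of the two unit clauses (x1), (-x1);
# the other clauses are irrelevant to the conflict.
mc = minimal_core([[1], [-1], [2, 3], [-2]], nvars=3)
print(mc)  # [[1], [-1]]
```

Note this gives a minimal (irreducible) core, not necessarily the minimum one; finding a minimum core is much harder, matching the minimal-versus-minimum distinction above.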
And cores are super useful, as I said. Now, there are some formulas where, if you use the normal proof systems in SAT solvers, you will get exponential proof sizes and there's nothing you can do about it, which is kind of a problem if you stay within the CDCL world and don't move into the CDCL(T) world, which is what we're going to do. The T means theory: you have an extra theory solver that can do more than plain SAT solving. And you will be programming this, hopefully, in about an hour. So, proof systems. This is something that, unfortunately, I wasn't interested in for a while, and then I realized it is really, really interesting to play with. The original approach was: let's write down every single sequence of resolutions we did, for every single clause, and then of course we can go back in time. Whenever the SAT solver does any kind of resolution, we write it to disk, and when we need to read back the proof, we go to disk and roll it up. The problem is that this is humongous: these proofs can be extremely, extremely large. Then came the idea of reverse unit propagation, which basically says the following. Suppose the solver derived a clause, say (A or B), as its final resolvent: the original formula implied (A or B) through resolution, via some sequence of resolutions whose intermediate clauses we never wrote down. That means the formula together with the units not-A and not-B must fail: it must immediately derive unsatisfiability under unit propagation. So wherever I see not-A, I substitute one; wherever I see A, I substitute zero.
And wherever I see B, I remove that literal, and if I see not-B, then that clause is satisfied. If I do this to a fixed point, I should get a failure, a conflict, immediately. And this is very easy to check. This is called BCP, Boolean constraint propagation, and it's very quick and fast with the data structures we normally use, so we can check it super quickly. If you think about it, this is the assumptions mechanism: I just assume A to be zero and B to be zero, run the SAT solver, and see whether it immediately reports unsatisfiable. This trick basically reverses the way SAT solvers normally work and allows you to write only your intermediate resolvents to disk. Instead of writing down all the different resolution steps by which we derived a resolvent, we write down: here is the output, you can figure out how it happened. This is called reverse unit propagation, and it's "reverse" because you reverse the direction: from this formula we derived this clause as an intermediate resolvent, and if you add to the formula the negation of that clause, you derive the empty clause. A question from the audience: does the derivation have to go through BCP? Yes, the only constraint here is that the check is done purely by substituting these variables and propagating until we're done, to a fixed point. If you do that, you should derive the empty clause; if you don't, the property doesn't hold. So when we check the proof, we do this for every recorded clause, and eventually we end up with the derivation of the empty clause, and that's it, we're done.
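A minimal sketch of this RUP check in Python (my own encoding, clauses as sets of signed integers, not production code): to check a derived clause, we assert the negation of each of its literals and run plain unit propagation to a fixed point, expecting a conflict.

```python
CONFLICT = "CONFLICT"

def unit_propagate(clauses, assumed):
    # Propagate to a fixed point. `assumed` is an iterable of literals
    # taken to be true; returns CONFLICT or the final assignment set.
    assign = set(assumed)
    changed = True
    while changed:
        changed = False
        for c in clauses:
            if set(c) & assign:                      # clause already satisfied
                continue
            pending = [l for l in c if -l not in assign]
            if not pending:                          # every literal falsified
                return CONFLICT
            if len(pending) == 1 and pending[0] not in assign:
                assign.add(pending[0])               # unit clause: propagate
                changed = True
    return assign

def is_rup(clauses, derived):
    # `derived` has the RUP property w.r.t. `clauses` iff asserting its
    # negation leads to a conflict by unit propagation alone.
    return unit_propagate(clauses, [-l for l in derived]) == CONFLICT

A, B, Z = 1, 2, 3
xors = [{-A, B, Z}, {A, -B, Z}, {A, B, Z}, {-A, -B, Z},
        {A, B, -Z}, {-A, -B, -Z}, {-A, B, -Z}, {A, -B, -Z}]

print(is_rup(xors, [-A, Z]))                   # True: follows by BCP alone
print(is_rup(xors, [Z]))                       # False: needs intermediates
print(is_rup(xors + [{-A, Z}, {A, Z}], [Z]))   # True once they're recorded
```

The third call shows why a RUP proof lists the intermediate resolvents in order: each line only needs to be RUP with respect to the formula plus the lines before it.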
And the next step up after RUP was something called DRUP, which I'm not going to go into in detail, but it basically lets you record clause deletions while the solver is running. The problem otherwise is that the checker eventually runs out of memory, and unit propagation gets quite slow when all the clauses are in there. So you record not only the clauses you derive, but also the clauses you forgot, because as I said, you eventually need to forget some of these clauses: you just don't have enough memory and time to keep them around forever. I don't know if I'll go into blocked clauses for the moment, but there's a way of doing something called extended resolution. You may remember bounded variable addition, where we added a variable to the formula and the number of clauses shrank: we added a new variable defining some concept, which gave us a shorter formula. It was somewhere at the very beginning; I can go back if you really want. Right here: this was the original formula, you add a new variable, and now it's shorter. This bounded variable addition is basically a form of extended resolution: you don't just resolve clauses, you add new definitions. Basically, X is defined in terms of these variables, and then you can use X in your formulas and in your resolution proof. And this can sometimes really help, turning an exponential resolution proof into a polynomial one. We don't really know whether extended resolution is the magical key to making P equal NP, but so far it's quite good. And we can use this tool of adding new variables to express otherwise exponential proofs in polynomial size. So this thing is not only a trick to make the formula smaller; it's also a way to make your proof smaller than otherwise possible.
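To see the flavor of bounded variable addition, here is a toy Python check (the formula is my own illustrative example, not the one from the slides): six two-literal clauses get replaced by five clauses using a fresh definition variable x, and brute force confirms the models over the original variables are unchanged.

```python
from itertools import product

def satisfies(clauses, model):
    # model: dict var -> bool; a literal is a signed int.
    return all(any(model[abs(l)] == (l > 0) for l in c) for c in clauses)

a, b, e, c, d, x = 1, 2, 3, 4, 5, 6
# Original: (a|c)(a|d)(b|c)(b|d)(e|c)(e|d), i.e. each of a, b, e
# paired with the concept "c and d".
orig = [[a, c], [a, d], [b, c], [b, d], [e, c], [e, d]]   # 6 clauses
# BVA-style rewrite: fresh x with x -> c and x -> d, so the pairing
# collapses into (a|x)(b|x)(e|x).
bva = [[a, x], [b, x], [e, x], [-x, c], [-x, d]]          # 5 clauses

orig_models = set()
for bits in product([False, True], repeat=5):
    m = dict(zip([a, b, e, c, d], bits))
    if satisfies(orig, m):
        orig_models.add(bits)

proj_models = set()                  # models of bva, projected onto a,b,e,c,d
for bits in product([False, True], repeat=6):
    m = dict(zip([a, b, e, c, d, x], bits))
    if satisfies(bva, m):
        proj_models.add(bits[:5])

print(orig_models == proj_models)    # True: same models over original vars
```

The saving grows with the pattern size: an m-by-n "grid" of such clauses shrinks from m*n clauses to m+n, which is the kind of compression that makes the proof-shortening effect possible.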
If you cannot add new variables, then there are cases where the resolution proof can only be exponential in size. Okay, let's move on a little. Now I'm going to talk about Gauss-Jordan elimination. Let's start where we all started: I know most of you have heard of Gauss-Jordan elimination and probably did it in high school, or at university. So here's our little matrix. The last column is special: it's the right-hand side, the augmented column. And each row here is an XOR. So: variable one XOR variable two XOR variable four equals one; that's how you're supposed to read the first row. Then: variable one XOR variable four equals zero, and variable one XOR variable three XOR variable four equals zero. Okay? This is linear, but linear arithmetic over GF(2), the Galois field of two elements. So you can do quite interesting operations: you can XOR two rows together, which is just a linear combination of the equations, and everything stays valid. And what we're going to do is say: this top row is going to be responsible for this column here, the red column, and I'm going to XOR the top row into every row below it that has a one in that column. You see that I XOR this line into these two lines here: in one of them this entry becomes a one, this stays a zero, this becomes a one because it was a zero; in the next row there was a one there already, so the two ones cancel each other and it becomes a zero. And there we go.
And now what's really nice is that the first column has a one only at the top. The other rows are just linear combinations of previous rows, so I didn't cheat. I'm going to do the same thing for the next column: this row becomes the representative for that column, and we XOR it into every row below it that has a one there, into this row and this row, and obviously those two ones go away. So now we're here: you see it's very nice, we have zeros here and zeros here. And we do the same thing once more for the third column; there are two ones here, we don't like that, we want them to be zero. Now we have a lower triangle: all zeros below the triangle, ones across the diagonal. Kind of nice. Now we do the same thing in reverse. We take this column and say: we only want a one in this one spot, so we XOR this row up into that row, and suddenly that one is gone. Then this row is responsible for this column; there's nothing to do here, it's all zeros, so I'm good. Here there's a one and I need to get rid of it, so I XOR this row into that row. Again, it's just a linear combination of two linear equations over GF(2): you XOR the two rows together, and voilà. We have a solution. This is a reduced row echelon form matrix, and I can read out the solution: variable one is zero, variable two is one, variable three is zero, and variable four is zero. We're done. And if you do it the naive way I just showed, which is not so stupid, I think it's good enough, it's a polynomial number of steps.
It's O(n cubed) in the worst case, which is quite manageable. There are some more sophisticated versions; I think you can actually get below n to the 2.8, something like 2.7 or 2.6. But nobody uses them, because in practice you have to take into account your cache hierarchy and your cache locality, and you can do a lot of this with instructions that process multiple data in a single instruction, SIMD instructions. This was actually one of the original motivations for SIMD: the SSE instructions, and now AVX, where you can process 512 bits in one go, and of course doing 512 bits at a time is quite a bit faster. So this is why all these instructions actually make sense here. So that's Gauss-Jordan elimination. And what's interesting is this: the little matrix here will be solved by a SAT solver, but create a matrix that's, say, 20 by 20, which is still a very small number of steps, the kind of thing you can sit down and do by hand in a few hours. Or give it to a computer program: you can write Python code to do this in 30 minutes, probably speed-code it in less than 20 if you really want to, and it will solve the problem very easily. Then give that same 20-by-20 matrix to a SAT solver, and you can go away, wait for the sun to become a white dwarf, and you'll still be waiting; it will never actually terminate. Well, eventually it might, if you get lucky. And if you're lucky, it's exponential. So it's not going to work.
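The elimination the slides walk through can be sketched in a few lines of Python. The encoding is my own (each row is an integer bitset: bit i is the coefficient of variable i+1, and the top bit is the right-hand side), and the example system is illustrative rather than the exact one on the slides, chosen to have the same unique solution that gets read out above. Representing a whole row as one machine word is also exactly why word-level XOR and SIMD make this so fast.

```python
def gauss_jordan_gf2(rows, nvars):
    # Each row is an int: bit i is the coefficient of variable i+1,
    # bit `nvars` is the right-hand side (the augmented column).
    rows = rows[:]
    pivot = 0
    for col in range(nvars):
        for r in range(pivot, len(rows)):            # find a 1 in this column
            if rows[r] >> col & 1:
                rows[pivot], rows[r] = rows[r], rows[pivot]
                break
        else:
            continue                                 # no pivot for this column
        for r in range(len(rows)):                   # XOR the pivot row into
            if r != pivot and rows[r] >> col & 1:    # every other row with a
                rows[r] ^= rows[pivot]               # 1 in this column
        pivot += 1
    return rows

# Example system with the unique solution v1=0, v2=1, v3=0, v4=0:
#   v1^v2^v4=1,  v1^v4=0,  v1^v3^v4=0,  v2^v3^v4=1
rows = [0b11011, 0b01001, 0b01101, 0b11110]
reduced = gauss_jordan_gf2(rows, 4)
solution = [(r >> 4) & 1 for r in reduced]
print(solution)  # [0, 1, 0, 0]
```

Because one XOR of two Python ints (or one AVX instruction on 512-bit lanes) combines an entire pair of rows, the inner loop is tiny; that is the cache-and-SIMD point from above in miniature.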
And so this is an interesting problem that I bumped into, and I wanted to get rid of this issue: how can I make this terminate? How can I make a SAT solver work with this thing, solve this problem without running in exponential time? All right, so this is where CDCL(T) comes in. CDCL(T) is CDCL with a theory; that's the T right there. And basically you have to imagine it as CDCL running here, all the stuff that you know, decisions and resolution and branching and restarts and clause learning and clause forgetting, all running here, so it's a normal SAT solver like any other. But on the side, you have a theory, and the theory solver can help the SAT solver do things. For example, you can give it the current assignment stack, saying A equals zero and B equals one: here's your formula, I'm at this decision point, can you tell me something? And the theory solver says, yeah, that's actually wrong: A equals zero, B equals one will never work. Do something. And then the CDCL(T) solver can say, okay, I must go back; I must either restart, or back-jump, or reverse the last decision, because this is clearly not going to work, I've just been told it's wrong. And this theory solver can be quite powerful. You can have, for example, Gauss-Jordan elimination running there, and then suddenly the thing that used to take you until the sun became a white dwarf now takes less than 10 milliseconds.
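As a rough sketch of what such a theory hookup could look like, here is a toy theory for a single XOR constraint that can propagate the last unset variable and report a conflict with a reason. All names here are hypothetical; this is not CryptoMiniSat's real interface:

```python
class XorTheory:
    """Toy CDCL(T)-style theory for one constraint v1 ^ ... ^ vk = rhs.

    Hypothetical interface: the CDCL core calls on_assign/on_unassign
    as the trail changes, then check() to ask for propagations or a
    conflict with a reason clause.
    """
    def __init__(self, variables, rhs):
        self.vars = set(variables)
        self.rhs = rhs
        self.assigned = {}                      # the theory's own state

    def on_assign(self, lit):                   # decision or propagation
        if abs(lit) in self.vars:
            self.assigned[abs(lit)] = int(lit > 0)

    def on_unassign(self, lit):                 # backtrack / restart
        self.assigned.pop(abs(lit), None)

    def check(self):
        unset = self.vars - self.assigned.keys()
        parity = sum(self.assigned.values()) % 2
        if not unset:
            if parity != self.rhs:              # 0 = 1: conflict
                # reason: this exact assignment can never work
                reason = [-v if self.assigned[v] else v for v in self.vars]
                return ('conflict', sorted(reason, key=abs))
            return ('ok', [])
        if len(unset) == 1:                     # one variable left: propagate
            (v,) = unset
            return ('propagate', [v if self.rhs ^ parity else -v])
        return ('ok', [])

t = XorTheory([1, 2, 3], rhs=1)
t.on_assign(1)                                  # x1 = 1
t.on_assign(-2)                                 # x2 = 0
status, lits = t.check()                        # x1 ^ x2 ^ x3 = 1 forces x3 = 0
```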
So the thing that used to take forever, because the proof system is not strong enough to deal with it, is suddenly extremely fast. CryptoMiniSat, for example, has this CDCL(T) inside; it actually now has more than one theory, although not all of it is released. The theory that is released does Gauss-Jordan elimination on the side, running at the same time as the SAT solver. I'll go into detail about exactly how, but the point is that a theory solver is a powerful tool, usually slow, but able to do things that are really cool, things the core solver cannot do. And you can have more than one theory solver, of course; there's no reason why there should be only one. There could be another one, and another one, and many, many more. For example, the theory can be Gauss-Jordan elimination, it can be pseudo-Boolean reasoning, it can be symmetric explanation learning, where you try to understand symmetries and steer the solver away from places where there's not much point in being, because they are symmetric to places already explored. So you're like, oh, let's not go there; instead of exploring all the symmetric cases, you say: well, bus number one cannot go on route number one, and once you've proven that's not possible, why try bus number two? It's the same bus, right? If bus number one cannot do it, then bus number two isn't going to do it either; it's the same bus, the same driver, the same fuel, it's never going to work. So you say, okay, bus number one cannot do it, and there's no point in trying to prove it for any other bus; it's the same thing.
And so this theory solver can basically say: well, it's all symmetric, don't do it; if you've proven it for one part, that's good enough, we know it's not possible for the other parts. So the theory runs on the side, but you have to be careful, because you don't want the theory to take all the time. And in CryptoMiniSat, Gauss-Jordan elimination can easily take 70, 80, 90% of the runtime. It's still worth it, right? Because otherwise you'd wait until you grow old and have grandchildren and it's still not finished. But it's still taking a ton of time, so you want to make sure you only call it when you really need to, and that it runs for as short a time as possible. And it can do all sorts of things. It can give me new propagations that are implied by the current set of assignments. It can give me a conflict, saying: hey, you're in a part of the search space where there's no solution, I know it. You don't know, because you might need exponentially many steps to realize it, but I know, so I'm going to tell you: don't do it. And when I tell you not to do it, I can give a reason: hey, these are the reasons why it's not possible. Of course, a trivial reason is to give all the decisions: yeah, all of these things together don't work. But sometimes you can give something smaller than all the decisions: actually, not all of these are important; it's these three variables, set in this specific way, that can never work. If you give that as a reason, that's better, and it's a very good idea to give a reason. Actually, I said you must give a reason, but you don't have to; you can just say, well, go explore the other part of the search space. But imagine what happens then: the SAT solver will have no idea.
It's not supposed to go to that part of the search space, but it will keep going there, and the theory will keep saying: no, no, that's not a good place, go back and reverse the decision. It's better to give it a reason why this is not a good place; then the SAT solver will remember it as part of the resolvent of the conflict, learn from it, and not go to the same place again. Okay, so you can do this better; there are things you can improve. For example, the previous setup said: I'm going to give the whole assignment stack, all the assignments, to the theory, and then you do your magic. But what you can also do is have the theory solver keep a state: it knows where we are, so I don't have to hand over every assignment that happened. The current assignment stack might be 10 million set variables, and handing over all 10 million and having the theory redo all its magic on them is slow. So instead you say: hey, the only thing that changed is that A and B have been set to zero. Now what? And then the theory solver says: okay, my previous state had 10 million minus two variables set, now these two extra ones are set; what do I need to do, does anything need to happen? This delta can give you a significant speed-up. But imagine: nobody has ever written, or at least I have never seen anybody write, about how to do Gauss-Jordan elimination with a delta. Here's your Gauss-Jordan elimination, you've derived your nice row echelon form; now I change your matrix a little bit. Now what? There are no books about this.
I mean, there are tons and tons of books about Gauss-Jordan elimination, because it's extremely important in a lot of fields, for example in cryptography. But there are no books about: hey, I changed your matrix a little bit, now deal with it. And it's the same for many other theories; it's not well developed. You can run your theory when I give you a full set of assignments, but if I give you some delta of assignments, how do you update your theory to match your changed state, your changed set of assignments? So we basically give it a delta assignment stack. We say: these variables were unset, these variables were set, now deal with it. And the theory solver can give reasons that are shorter and maybe more expressive, so it can give more than one reason, and things like that. You can also do something called lazy interpolant generation, or lazy reason generation, where you say: if you want the reason why this propagation happened, I can tell you, but I'm not going to do it now, because it's quite expensive; instead I give you a placeholder, and if you ask for that placeholder to be resolved, I'll give you the reason. That way you don't have to understand exactly why something happened unless you really need the reason. For example, during conflict analysis, it might be that this reason is never needed, at which point it's great, because I didn't have to do the work. The other way of doing it is called greedy interpolant, or reason, generation, where you always compute the reason, because sometimes it's cheap: sometimes, once you already know that A needs to be set to one, you know exactly why; in some cases it's trivial to read out, and then you just do it greedily. But lazy is usually more performant. So this is all about performance at this point.
Yes? [Question:] Would you practically have theory solvers that call other theory solvers? So the question is: is there any theory solver that calls other theory solvers? I have never done it, but I guess it's possible. The way I see it, the CDCL part is my workhorse. The thing that you will be implementing, cardinality constraints, could actually work alongside the Gaussian elimination that's already there, and everything talks through the CDCL. It's like a card dealer, you know? It gets all these propagations, does its magic, and asks: hey, theory solver, do you have anything? Nothing? Okay, how about this theory solver? Ah, yeah, something; okay, I update my current state and give the new state to the other solver. Nothing? Okay, good, I need to make a decision. I give it to this theory solver, nothing; I give it to that theory solver, it gives me something, okay, I do my update. So the CDCL is basically this kind of dealer, almost like a dealer with cards, and all the information flows through it in my framework. You don't have to do it that way; you could conceivably do it in different ways, but I find this to be a good setup, because the CDCL core is pretty robust and I test it quite well; there's a lot of engineering that goes into testing this system well. But it's a good question: how do you integrate theory solvers? Do you put one theory solver here, the next one there, down a chain somewhere, or do you integrate them as a star? The star methodology is the one I have seen, the one I'm going to use, and the one you're going to be using. Conceivably you could chain them, although these theory solvers are often extremely messy.
And so you might want to just go through the CDCL, because it gives you this very tight, nice interface. The hard part here is that the theory solver needs to keep a state, and needs to update it and understand the delta. And if it's really cool, it can do this lazy interpolant, or lazy reason, generation, where it doesn't immediately compute why something needs to be set to a value; it hands out a placeholder: hey, if you need that, I can compute it for you on demand, because it might take me quite a lot of time to compute it. All right. So what do we need to do this for Gauss-Jordan elimination? That's what we're going to do; this is the trick. We're going to do Gauss-Jordan elimination, and then we're going to do symmetries, not symmetric explanation learning, just symmetries in general. So we're going to do CDCL(T), and our theory is Gauss-Jordan elimination. We want it, we need it, we like it. XOR is our thing, because I do cryptography, so that's why XOR is my thing. Okay, so we need a few things. First of all, we need to recover the XORs. I give you the CNF and say: okay, deal with it. There are no XOR constraints per se in there; the XOR constraints are inside the CNF, in the form I explained, with all the intermediate variables and lots and lots of clauses that look very similar, what's called blasted into the CNF. But you need to recover them, find them in the CNF, and so we need to do that. That's number one. Number two is that there might be more than one matrix in this CNF. I might have been tricky and given you 10 different XOR sets, 10 different matrices. And if I gave you 10 different matrices, you don't want to put the 10 of them into one big chunky matrix, because there's no point.
You could solve them separately, 10 times. And remember, it's O(n³), where n is the size of your matrix, so you want n to be small: three separate matrices of size n cost three times O(n³), while one merged matrix of size 3n costs O((3n)³) = 27·O(n³), so the merged version is roughly nine times slower. So you really want this thing cut up into smaller matrices, and you want a theory solver per matrix. I actually instantiate these theory solvers 10 times, so you have 10 theory solvers, each running with a different matrix, because that's the easiest way to do it. And I'm writing C++, object-oriented code; of course this is an object, and I just instantiate it 10 times with the 10 different matrices, and we're done. And you want the delta update mechanism, so you don't start the Gaussian elimination from the beginning every time. Of course you could, but that's not going to be efficient, because Gaussian elimination is polynomial in time, but it's still quite a lot of time. So you want two things. One, when a variable is set: the CDCL solver tells the theory, hey, A has been set to one, deal with it; these are decisions and propagations, the cases where something has been set. And the other is when a variable is unset: when we restart, or back-jump, or backtrack, we say, well, that was a bad branch, let's go back; hey, this variable has been unset, and now we're going to set it to some other value. These are the two things it needs to understand and deal with. And of course, it needs an efficient data structure for these quick updates of the internal state; the state update needs to be fast.
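The cost argument for splitting can be checked with a one-liner: k blocks of size m cost k·m³ elimination steps, while one merged matrix of size k·m costs (k·m)³, so merging is k² times more work. A tiny sketch:

```python
def gauss_cost(n):
    """Gaussian elimination is O(n^3); use n**3 as a unit cost."""
    return n ** 3

m, k = 100, 3                        # k independent matrices of size m
split_cost = k * gauss_cost(m)       # eliminate each block on its own
merged_cost = gauss_cost(k * m)      # one big block-diagonal matrix
ratio = merged_cost // split_cost    # (k*m)**3 / (k * m**3) == k**2
```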
And we'd like reason generation: we'd like to know why some value has been propagated, or why a conflict is there. A conflict is quite easy, right? A conflict is when you have a row where everything on the left is zero but the right-hand side says one, which means zero equals one, which is of course a conflict; that's not possible, something is wrong. One of the decisions we made must have been wrong, or, if there are no decisions, the whole problem is unsatisfiable. So you want the reason for all the things that happen. So we're going to do that, which is going to be quite a lot of fun. It's actually difficult, so hopefully you can follow; if not, we can stop at some point, and you can also ask questions, don't be afraid. I think it's pretty tough. It took me more than a year to understand, and I wrote the damn thing; that should give you an answer. But you know how it is with complicated theories: the first time you try to understand them, like the original paper about 3-SAT and Turing machines and all that, it looks really, really complicated, and then you read a modern textbook and the whole thing is summarized in one paragraph. And that's what's really exciting. Maybe I'm not talking about SAT now, but in mathematics, to me that's really exciting: things that used to be extremely difficult, taking derivatives for example, used to be at the edge of knowledge, and now a high school student does it. I think that's what's really exciting about mathematics, and hopefully computer science as well: things that used to be really difficult for experts are now done routinely by university students. So just because it took me a year doesn't mean it will take you that long.
Anyway, not that I'm an expert, but here is what we want to extract. There are these XORs being blasted: here's an XOR, x1 XOR x2 XOR x3 = 1, and this is the way it is blasted, described in conjunctive normal form. A disjunction of literals is a clause, and each of these lines is a clause, all conjoined together. What you have to see, and what makes this whole thing a difficult exercise, is that this is actually a relatively simple pattern. If you look at it, with some Python code you could most likely recover this relatively quickly; it's quite a simple pattern. You see that every clause negates an even number of literals: this one has two negations, this one has two, this one has two, and this one has zero. So it's actually relatively easy to recover: you just look for clauses of length three with an even number of negations, and you're good. And if it's an odd number of negations, the XOR equals zero, amazing. Now the problem is that sometimes you get this, which is really painful, because this is equivalent to this, but this thing merely implies this, and slightly more. You see that I removed a literal here? I was being tricky. You see that x3 is not here? What happened is that this set of clauses still implies the XOR; of course it's stronger than the XOR, but it still implies it. So how are you going to recover this? Simple pattern matching might not do the trick, because now the clauses are not all three long: some are two long, some are three long. How am I going to do this?
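The blasting described above can be sketched like this: every wrong-parity assignment gets banned by one clause, so the clauses for rhs = 1 all carry an even number of negations (a toy sketch, using the DIMACS-style convention that a negative integer is a negated literal):

```python
from itertools import product

def xor_to_cnf(variables, rhs):
    """Blast  v1 ^ ... ^ vk = rhs  into CNF clauses.

    Every assignment with the wrong parity is banned by one clause,
    so a k-variable XOR needs 2**(k-1) clauses.
    """
    clauses = []
    for bits in product((0, 1), repeat=len(variables)):
        if sum(bits) % 2 != rhs:               # wrong parity: ban it
            clauses.append([v if b == 0 else -v
                            for v, b in zip(variables, bits)])
    return clauses

cnf = xor_to_cnf([1, 2, 3], 1)
# Each clause negates an even number of literals:
# [1, 2, 3], [1, -2, -3], [-1, 2, -3], [-1, -2, 3]
```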
So it's actually quite tricky, but it's still implied. You still want to get this XOR, because it's still implied; it's not equivalent, but it's still implied, so we should still detect it. [Question:] The AND here, there are parentheses, the clauses are separate, right? Oh yes, each line is a separate clause. Sorry, maybe that's not very clear: you can't just drop the parentheses between the disjunctions and conjunctions, so you have to be careful. Each of these lines should be taken separately, and all of them are conjoined together. Yes, sorry, it's maybe a little confusing. I'm not sure I should go through the whole algorithm, but what I am going to talk about is that we use a form of Bloom filter to search for these clauses that are kind of similar but slightly off. A Bloom filter basically gives you false positives but no false negatives. I don't know if you know what a false positive and a false negative are. A false positive means: hey, it might be it, it might not be it, we're not really sure; but I'm never going to tell you "no" when it could be it. So when I say no, I'm 100% sure it's no; when I say yes, I'm not so sure. Of course, the stupidest Bloom filter just tells you yes all the time; a very stupid Bloom filter that doesn't help much, because it just says yes, yes, yes, and then you have to check everything: oh, is it really? A blooming idiotic filter. Yes, a very idiotic filter. But if you do this well, it can actually speed you up quite a bit, because most of the time it will say: no, that's definitely a no; and only here and there: I'm not so sure, check it.
And if it does, the quality of the filter will give you a meaningful improvement in the speed of the search. We use this quite often, in many different places in SAT solvers. And here, the Bloom filter is actually a 32-bit value: you create a 32-bit bit field, and you use this bit field for filtering. The bit field sometimes has more ones than you would like, so it's not exact, but wherever it has a zero, we're sure, and that's where we can say no for certain. [Question:] So you want the Bloom filter to detect which clauses participate in the extraction? Yeah, exactly. What we do is say: let's suppose this clause is part of an XOR constraint, and now we want to filter out all the clauses that cannot possibly be part of this XOR constraint. Once we have filtered them out, we have a chunk of clauses that may or may not be part of the XOR constraint, and then we do the expensive check: is it really part of it? I don't think I'll go through all of this, because it will just make you very tired, but the basic idea is that we go through every clause and ask: could this clause possibly be part of an XOR? If a clause is really large, say 20 literals, then most likely it's not part of a blasted XOR, because then somebody would have created 2 to the power of 19, so about half a million clauses, to encode one XOR constraint, and we hope nobody was that idiotic when they created the CNF. They could have been, but let's hope they weren't, okay? Most people will cut the XOR down, realizing that half a million clauses doesn't make sense, because with 10 such XOR constraints, which is very, very few, you'd suddenly have millions of clauses; nobody does that. So we put a size limit, which is a heuristic: if the clause is too large, don't bother with it.
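A minimal sketch of such a 32-bit filter (a simplified stand-in for what the solver actually does): hash each variable to one bit; if the clause's signature has a bit that the XOR's signature lacks, the clause can be rejected for certain, while a passing signature may still be a false positive that the exact check weeds out:

```python
def abstraction(variables):
    """32-bit Bloom-style signature: each variable sets one bit.

    No false negatives: a clause whose signature has a bit outside
    the XOR's signature cannot share all its variables with the XOR.
    False positives happen when different variables hash to one bit.
    """
    sig = 0
    for v in variables:
        sig |= 1 << (v % 32)
    return sig

def may_match(clause_vars, xor_vars):
    c, x = abstraction(clause_vars), abstraction(xor_vars)
    return (c & ~x) == 0     # False: certain no; True: verify exactly

# Variable 5 cannot be in an XOR over {1, 2, 4}: certain reject.
# Variables 33 and 1 share bit 1, so {33, 2} is a false positive
# that the expensive exact check has to reject later.
```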
Usually this cutting number that I mentioned, where people cut long XORs into chunks, is at most seven or so, because a seven-variable XOR implies 2 to the power of 6, so 64 clauses, which is not too bad. If a clause is larger than that, then no. Otherwise we check: did we try this clause already? We remember every clause we have already tried as the seed of an XOR, so we never do it again. And then: okay, what if this clause actually is part of a blasted XOR? What would that mean? What other clauses would I need to make an XOR out of it? Say your clause was A or B or C. Then I'm going to look for A XOR B XOR C equals some value, either one or zero. In this case, because A or B or C bans a particular assignment, you will know whether it's one or zero, but you look for an XOR over these three variables and see: is it there or is it not? And this code here will do that. It says: okay, this is my base clause, that's my seed, and for all the literals in this clause, I'm going to go through the occurrence lists of those literals. What we do is first a pre-processing pass over the CNF formula to create occurrence lists: for every single literal, I know which clauses it is in. So if I have, say, 10 variables, then I have 20 literals, and I literally have a data structure that says: okay, literal one, literal not-one, and so on, each mapped to the clauses it appears in: this one is in clause one, this one is in clause two and clause three, this one is in clause four.
And then v2 is in, I don't know, clause two and clause four and clause five, et cetera, and I can look this up. And if I have a candidate XOR, say v1 XOR v2 XOR v4, I'm like: mm-hmm, this is what I'm looking for. Then I look into the occurrence lists of each of these literals and check whether any of the clauses there match this pattern. And of course, if any of those clauses contains a variable that is not v1, v2, or v4, it's not interesting; it cannot possibly be part of this XOR. You see that this lets me comb through the CNF really quickly, because I never look at the ton of clauses that cannot possibly be part of this XOR. And what I'm going to do is go through all the candidate clauses that you see here and ask: is it the same size or smaller? It cannot be larger than three; the size cannot be four or more, because then it would have one extra literal, and where would I put that? So if the size is larger, throw it away. Then we do the Bloom filter: does the Bloom filter still pass? If it passes, okay: the size is fine, the Bloom filter says it might be it; now let's do the real check, let's really verify that it's part of it. And basically, I gather all the clauses that together can make this XOR up. Of course, it's a set of clauses; in this case, it's four clauses we're looking for that can do this.
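Building the occurrence lists just described is straightforward; a small sketch, again with negative integers for negated literals:

```python
from collections import defaultdict

def build_occurrence_lists(cnf):
    """Map every literal to the indices of the clauses it occurs in,
    so 'which clauses mention this literal?' becomes a table lookup."""
    occ = defaultdict(list)
    for idx, clause in enumerate(cnf):
        for lit in clause:
            occ[lit].append(idx)
    return occ

cnf = [[1, 2, 3], [1, -2, -3], [-1, 2, -3], [4, 5]]
occ = build_occurrence_lists(cnf)
# Clauses touching variable 1, in either polarity, without scanning
# the whole CNF; clause 3 over {4, 5} is never even looked at.
candidates = set(occ[1]) | set(occ[-1])
```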
And I'm going to look for all those clauses and see if they're inside the CNF. And this combing through is the really hard part, because, as I mentioned, I have to do this for every single clause, so that's quite a bit of work. The clauses we're looking for here; ooh, I always get confused, and V is a really bad choice, so I'm going to call these A1, A2, and A4. So A1 XOR A2 XOR A4 is supposed to be zero, which means the all-zero assignment is a good one, and I need to ban something like 1, 0, 0. So I'm going to ban 1, 0, 0. One second. This clause bans 0, 1, 1, for example; this one bans 1, 1, 0, which is a wrong one to ban; and this one bans 1, 1, 1, which is a good one to ban. And now you see the pattern: it's the clauses with an odd number of negations, so it's going to be not-A1 or not-A2 or not-A4, with three negations, and then the ones with one negation each: one negation, one negation, and one negation there. Those are the clauses that I'm looking for, and when I'm combing through the CNF, I look for exactly these clauses in here. And the trick is to do it fast. And that's it. So what it basically does is go through all the clauses and do this quick check, and once we have combed through all of it, we need to check whether they actually do make up an XOR in the end. So we put all the clauses that could possibly be part of this XOR into a big bag, and once the bag is full, okay, I'm done: these are all the clauses that could be part of this XOR. I go through the bag and ask: is this good, is this enough, are these all the clauses that I need? And if not, we're done with this candidate. [Question:] How do you handle the case where it was not-A1 or not-A2? Well, if one of the literals is missing, I still put it into this big bag.
So I still put it into the bag of clauses that could be it. Because if it's missing a literal, so just to go back a little bit, if it's missing a literal here, it could still be part of the encoding, so it needs to be part of my little bag. This is my little bag. And then checking the bag is going to be a bit more complicated, because you're not just looking for this exact pattern, you're looking for a slightly more complicated pattern; I'll talk about that a little later. Actually, I didn't go into the details of this quick check, because the bag verification is what matters: this is the bag that you need, and verifying the bag is a slightly more complicated scenario, especially if the XOR is kind of large. But in general, you just need to verify that this bag implies this XOR, and of course you should only put clauses into the bag that could possibly be part of the XOR. So first you build the bag, and then you verify that it's the right thing: this part here generates the bag, and then you check whether it's right. Anyway, next up: what I was talking about before is the case where you have a matrix, but it's actually not one matrix, it's two matrices, and you want to cut it into two chunks. The way you do that is you first put every single XOR into a different matrix: every XOR is in its own matrix. Then I go through every single XOR individually, and for every variable in that XOR, I go to all the XORs those variables are in and pull them into this one matrix, and iteratively refine, so that the matrices that used to be individual end up merged into one big matrix. It's basically just a merge algorithm, if you think about it: I merge the XORs together that fit together, and of course the ones that were separate end up in different matrices. So it's nothing but a merge algorithm where I assign every single XOR a different bucket.
And then I merge the buckets when a variable is found to be in two different buckets. Of course, no variable should be in two different buckets, so when I see a variable that is, I merge the buckets together. And I keep doing this until I've gone through all the XORs and everything that could have been merged has been merged, and I'm done, with some number of buckets at the end: maybe one bucket, maybe multiple buckets. So just imagine buckets of variables that are connected: when I see that they're connected, that the same variable is in two different buckets, I put them together, and I just merge, merge, merge, and eventually I end up with either separate buckets or a single bucket; we'll see. And that's all this does: it goes through all the XORs and merges the variables that are found to be together in the same XOR. So if an XOR has A, B, and C in it, then obviously A, B, and C all belong in one bucket; and if I see another XOR with C, D, and E, then I know, okay, it's the same bucket, which is now A, B, C, D, and E, et cetera, et cetera. And eventually this merge terminates, when I've gone through all the XORs, so the number of steps is essentially linear in the number of XORs, depending on what you mean by a step: I only need to do merges as I go through each XOR, and that's all. I just go through all the XORs, keeping a lookup of which bucket each XOR belongs to and which bucket each variable belongs to, and I keep merging them, and eventually I end up with a few buckets. Here there are clearly two buckets, because they don't intersect. So, the issue I have is the delta update with Gaussian elimination, which is what we're going to talk about now. You write your Gaussian elimination, you run it from the beginning to the end, you're happy, you have this row echelon form.
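The bucket merging just described is exactly union-find; here is a sketch that partitions XOR constraints into matrices by shared variables (illustrative, not the solver's code):

```python
def partition_xors(xors):
    """Group XORs that share variables into buckets via union-find:
    each XOR starts in its own bucket, and sharing a variable
    means 'same bucket'."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for xor in xors:                        # one pass over the XORs
        for v in xor[1:]:
            union(xor[0], v)                # all vars of one XOR connect

    buckets = {}
    for i, xor in enumerate(xors):
        buckets.setdefault(find(xor[0]), []).append(i)
    return list(buckets.values())

# {A,B,C} and {C,D,E} share C, so XORs 0 and 1 form one matrix,
# while {X,Y} stays in its own separate matrix.
groups = partition_xors([['A', 'B', 'C'], ['C', 'D', 'E'], ['X', 'Y']])
```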
And now I tell you, well, actually this column, it's been set. So now what do you do? That's what we're going to talk about now. First, some things to set straight: how do we store these matrices? If you have a look, all of these matrices only have zeros and ones, so you can use a bit-packed format. This row is going to be less than an integer, right? It's one, two, three, four, five, six, seven, eight, nine columns. Okay, so this fits in a 16-bit integer. You can store it in 16 bits: one line, 16 bits. And you can do this for each of these, so this whole thing is actually four 16-bit integers and we're done. We don't spend a whole byte on every single one of these bits; we keep them in what's called bit-packed format. But the problem with bit-packed format is the following. If everything is completely bit-packed, then swapping rows is going to be kind of expensive, because if I want to swap this row with this row, what happens? I have to do copy operations. I need to copy this integer into some temporary value, then copy the other one here, and copy the temporary there. So it's three copies, actually. Now imagine this matrix having a thousand columns; suddenly it's quite an expensive operation to swap two rows. So you don't really want to swap any rows. And remember this row echelon form from before, the one that was very beautiful? It looks beautiful, I like it. The only problem is that if you aim for this visual representation, you're going to be in a mess, because it will be very expensive: you have to swap rows around. But if you're not interested in the visual representation, what you're interested in is what it actually means. What it means is that every single column here has a row that represents it. So every single column, this column, has this row that represents it.
This column has this row that represents it. If I swap these rows, it doesn't make a difference. It won't look as nice; I agree, it looks very nice when it's in pristine row echelon form. But if I swap these rows around, it still means the same thing. If I swap this row to the bottom, it still means that this row is responsible for this column, and the other row is responsible for the other column. The tidy form is nice because the first row is responsible for the first column, the second row for the second column, the third row for the third column, et cetera. It's nice and easy. But if I swap them around, it's still the same thing; it just looks a little messier. So we're going to make it messy, because I don't like swapping rows around. It's expensive, you have to copy memory around, and I don't like that. You will see that the memory copying, if you insist on this nice row echelon form, takes about 90% of your time, for nothing, only so that you can print it very nicely. But otherwise, no reason. So we are not going to swap rows. What we're going to do instead is keep track of which row is responsible for which column. We need to store that somewhere. Before, it was very easy: row number one, column number one; row number two, column number two. Now it's not so easy. We need another data structure that says, well, row number one is responsible for column number 20, and row number two is responsible for column number five, and so on. So we need this very small data structure. It's extremely small: it's just an array, one entry per row, that's all it is. So now any row can be responsible for any column, as long as it has the only one in that column. For example, we can say this row here is responsible for this column here. You see this?
This row here is responsible for this column here; you see that it has the only one there. And same thing here: this row at the bottom is responsible for this column here. You see, it's responsible for it; nothing else has a one there. And the same thing here: the third row is responsible for this column here; there's no other one there besides this one. And for the last column, the responsible row is the first row. We just swapped it around. I could swap this around and it would look more like row echelon form, of course, but I'm lazy and I don't want to do that. Okay, so this is the only thing that you have to get right. Once you realize that you don't need the row echelon form, that any row can be responsible for any column, then we're good to go, because then the slide after this one will hopefully make more sense. One more thing that is kind of interesting about Gaussian elimination. If you think about it, an XOR can only create a propagation or a conflict if either zero or one of its variables are unset. If an XOR has two variables that are not set, that are unknown, then one can be a zero and the other a one, or the other way around, or both can be one, or both can be zero. We don't really know which is which, so it doesn't tell you much. But if I tell you, here's this XOR, everything in it is zero except this one variable that we don't know, and the whole thing must equal one, well, then you know that this variable must be one. More generally, if everything else is set to whatever values, and the XOR must equal one, but that one variable is unset, then that variable is forced. And if everything is set, if all the variables in the XOR are set, then you can evaluate it and check whether it holds. So the XOR cannot do anything if there are two or more variables that are unknown.
So the only cases that interest you are when zero or one variables are unknown, because then you have to do something. If one variable is unknown, you have to do a propagation. If zero, you have to check whether it's a conflict, whether something is wrong with this XOR, because if it evaluates to zero but it's supposed to equal one, then you have to tell the SAT solver: we're wrong here, backtrack. So what we're going to do is something called the two-variable watch scheme. We're going to watch two variables. We don't care which two; we pick two that are unset, and if either of them gets set, we say, okay, let's watch another variable instead. And we keep on watching two variables all the time, until at some point you cannot find a new unset variable to watch. That is to say, there's only one or zero unset variables left, and then you have to do something. So this is the two-watch scheme over variables, which is very similar, if you know SAT solvers, to the two-watched-literal scheme: there you watch two literals, here you watch two variables, that's all. And this basically just restates the same thing. If two or more variables are unset, the XOR cannot do anything. If one variable is unset, then you must propagate that one variable. And if zero variables are unset, then you need to check whether the XOR is satisfied or unsatisfied. Is it one equals one? Okay, fine. Or zero equals zero? That's also good. But if it's zero when it should equal one, then you have to do something: you have to backtrack. And we're going to use terminology from the simplex method, the terms basic and non-basic. The column that a row is responsible for, we call basic, and a column that no row is responsible for, non-basic.
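The rule above, that an XOR only acts when zero or one of its variables are unset, can be sketched as follows. This is my own toy formulation, not the solver's actual code:

```python
# Classify an XOR constraint  v1 ^ v2 ^ ... = rhs  under a partial assignment.
# Variables missing from `assignment` are unset.

def xor_state(variables, rhs, assignment):
    """Return ('nothing', None), ('propagate', (var, value)),
    ('satisfied', None), or ('conflict', None)."""
    unset = [v for v in variables if v not in assignment]
    acc = rhs
    for v in variables:
        if v in assignment and assignment[v]:
            acc = not acc  # fold the assigned true values into the right-hand side
    if len(unset) >= 2:
        return ("nothing", None)   # two or more unknowns: no conclusion possible
    if len(unset) == 1:
        return ("propagate", (unset[0], acc))  # the lone unset variable is forced
    # Everything is assigned: check whether the XOR actually holds.
    return ("satisfied", None) if not acc else ("conflict", None)

# a ^ b ^ c = 1 with a=0 and b=0 forces c = 1.
print(xor_state(["a", "b", "c"], True, {"a": False, "b": False}))
# → ('propagate', ('c', True))
```

With two or more unknowns the function returns "nothing", which is exactly why watching just two unset variables per XOR is enough.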
Now, I'm going to keep talking about this "responsible" idea instead of "basic," because I think it's a lot easier to follow, but if you read the literature, you will see these terms quite often, borrowed from the simplex method's terminology. And what do we need for this? Well, we need the watches for the variables, right? We need to know, for a column, which row is responsible for it, if any, and given a row, I need to be able to tell you which column it's responsible for. Yes, it's going to be a lot of fun, but I think it gets a lot easier when I show you the actual thing. One thing that you have to observe is that the matrix is often underdetermined, so you can't read out a solution. There are more unknowns than equations, so you're stuck, which means you wait for the SAT solver to make some decisions, and then suddenly things start happening. So your matrix is always going to look a little like this: more columns than rows. Of course, you'd want the number of columns to be at most the number of rows, because then you could just read out the solution, but that's not the case. Usually you have many, many more columns than rows. Each column is a variable, and you wait for the SAT solver to assign some columns to some values, and then things start happening. But until then, you can't really do much. You try to do the row echelon form kind of thing, the nice little thing that we saw, as far as you can go, and then you wait. You wait for the SAT solver to update your state, and then you keep on updating your state in lockstep with the SAT solver. So, for example, here's a matrix. I just want to show you this one. Remember, these rows are the responsible ones: this row is responsible for this column, this row for this column, this row for this column, and this row for this column.
But you see that there are some columns that nobody is responsible for? Nobody's responsible for these columns here, because there aren't enough rows. Nobody's responsible for the first column, for example. So let's say we made a decision, okay? The first column has been set to one. The SAT solver just told me that the variable corresponding to this column has been set to one. Okay, so I'm going to update my state. Nothing much really needs to be updated here: I zero out this column, and wherever it had a one, I flip the corresponding value at the end of the row. That's all that happened. I set the first column to a value, and now this variable has obviously disappeared from all my XORs, and I updated the values correspondingly. This was a one here, the variable was set to one, so it got XORed in here, done. Okay, what happens next? Interesting thing. You see this? This row here suddenly has only one unset variable left, here, and it's equal to one. This column has been set, so this XOR is now triggering. It only has one unset variable. It used to have two: this variable and this variable. You see what happened? I set this value here, everything is fine, but now this row has only one unset variable inside. Remember that this last column is special, it holds the value, so don't count it. So this variable here is going to propagate. It's going to get set to one, because this XOR now has only one unset variable. Remember that the XOR only does anything if it has one or zero unset variables? It has one unset variable, so now something happens. And remember that for each XOR we watch two variables, and here we were watching this variable and this variable for change. Now one of them changed, so we have to do something about it, and we're going to do a propagation.
So we just did a propagation, and we're happy now. This variable has been set to one. Next up. We got a propagation and this thing has been set to one, remember? And now everybody's happy: this is an all-zero row. Zero equals zero, everybody's happy. This row could be removed, but we're just going to keep it there for visual purposes. So you see that this variable has been set to one, now one equals one, and everybody's happy. Okay, now say there's a new variable that gets decided, this one here, this one right here. What happens? Well, nothing much, really. It's been decided to zero, actually. But what happened here is, you see, this row used to be responsible for this column here. Well, it can no longer be responsible for that column, because that column has been set. So this row now has nothing that it's responsible for. You see that it's not responsible for anything; it doesn't have any red in it. But this row needs to be responsible for something. So we're going to make it responsible for this column right there. That's your thing now. Today you're going to be responsible for that one column. And of course, if this row is responsible for this column, the matrix is now in a wrong state, because that one shouldn't be there and that one shouldn't be there, right? If you're responsible for a column, no other row should have a one in that column besides you. So we're going to XOR this row into the first row and the last row to remove those ones. You see that this one and this one disappeared? They disappeared. And we're also much more row echelon now. I don't know if you noticed, but we're sort of closer to row echelon form; it's nicer, we're getting there. But now we have a new propagation.
You see, we did the XORing thing, and suddenly, you see what happened with this row at the bottom? It's only got one variable that is unset. Remember the rule: if one or zero variables are unset, then you have to be vigilant. So now it's like, oh, something happened here, we need to deal with it. What we're going to do is set this variable to one, and now the row becomes all-zero again. You see that now we have two zero rows. Kind of cool. And the story goes on. I don't know if you noticed, but is the first row also propagating? You see this? Oh, no, it's not propagating. No, it's got two unset variables in there. Yeah, it's got two variables in there. And of course there are no more columns that we can make these rows responsible for: the all-zero rows cannot be responsible for anything, and every other row is correctly responsible for one of the columns. And so this would go on. All right. I don't know if this made sense, but you see, what I'm doing is I just keep on making decisions and propagations. And if I remove the column a row was responsible for, like here, where I removed the one it was responsible for, then I say, okay, this row needs to be responsible for some column. So I assign it whichever one, actually normally the first one I find, because I'm cheap and I'm lazy, and the first one is good enough. I make it responsible for that column, and then I readjust the matrix. This is called adjusting the matrix, because I have to adjust it to stay in this modulo-row-swapping row echelon form. And that's the trick. This is the quick update mechanism that keeps you in this modulo-row-swapping row echelon form. It's kind of cool.
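The adjust step just described could be sketched like this, with each row kept as one bit-packed integer and a small responsibility map instead of row swaps. All names are illustrative, not CryptoMiniSat internals:

```python
# Rows are bit-packed Python ints; responsible[r] is the column row r owns,
# or None. Adjusting = pick a new owned column, then XOR this row into every
# other row that still has a 1 in that column.

def adjust(rows, responsible, r, candidate_cols):
    """Give row r a new responsible column and clear that column elsewhere."""
    # Lazily take the first candidate column where row r has a 1.
    for col in candidate_cols:
        if rows[r] >> col & 1:
            responsible[r] = col
            break
    else:
        return  # all-zero row: nothing left to be responsible for
    # XOR row r into every other row with a 1 in the chosen column, so row r
    # ends up holding the only 1 there (row echelon "modulo row swapping").
    for i in range(len(rows)):
        if i != r and rows[i] >> responsible[r] & 1:
            rows[i] ^= rows[r]

rows = [0b1011, 0b0111]   # row 0 has 1s in columns 0, 1, 3; row 1 in 0, 1, 2
responsible = [None, 2]   # row 1 already owns column 2
adjust(rows, responsible, 0, [0, 1, 3])
print(responsible[0])     # → 0
print(bin(rows[1]))       # → 0b1100  (row 0 XORed in, clearing column 0)
```

Note there is no row swap anywhere, only XORs of whole machine words and one array write, which is exactly the point of dropping the pretty echelon layout.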
Okay, one more thing that's kind of interesting. I was talking to you about this reason clause thing. I'm not going to go into reason clauses much, other than to observe the following. When I green out this thing here, think of the green as nothing but green glasses. All the values are still there underneath. You don't actually zero the column out: when you know that a variable is set to a value, you just treat its column as zero while you do this updating work, but it's actually not zero; there's some value in there. And if you do the XORs over the whole matrix, just as before, then every single line is a linear combination of the lines it came from. The way the matrix starts, each row is of course part of the CNF. And if you just XOR these rows together, the result is still implied by the CNF; it's just a linear combination of the original XORs. So if you look at the green thing here: underneath the green, the same stuff is still there, but we treat it as zero when we do this operation. When we actually need a reason clause, the green columns hold exactly what you would have if you had just XORed the lines together. It's just a linear combination, and that's your reason clause. So this matrix is actually much noisier than it looks: there are lots of ones here instead of zeros. And everything in this matrix is just a linear combination of the original XOR constraints that you had. So the reason for anything is: you just read out the line, because anything that happened is contained in that line. If something happens, for example this variable is set to a certain value, then that means everything in that row is false except for that one variable. And you say, okay, well, that's a reason clause.
That's a clause where everything is falsified except the one literal that I need to set to a certain value. That's your reason clause for a propagation. And if it's a conflict, then what happened is that the line evaluates to zero, but its value is supposed to equal one. So all the variables in that line, viewed without the green glasses, form a linear combination of the original XOR constraints, and they are themselves a conflict reason. You just read the line straight off, and that's it. So the green is just your glasses. You wear the glasses when you do the job of updating your matrix, but when you're not updating your matrix, you take off your green glasses, and the reason clause is right there, because every single line in this matrix is just a linear combination of the original XOR constraints that you have. And that's it, that's the trick. So, I don't have much more time. One thing that I do want to tell you about is the backtracking. The good thing about the green-glasses trick is that when you backtrack, all you do is take off your green glasses. That's it. You say, oh, I'm going to remove my green glasses, and right, we're back there. I don't need to do anything. My green glasses had these columns treated as zero, but when a variable is unset, I just remove the glasses for that column, and I'm good. That's it. So these green glasses both help you generate your reason clauses and allow you to unset things. And what's really interesting is that once you unset something, all the rows that were responsible are still responsible. If you do the backtracking, all the invariants that you had are still valid. Each row is still responsible, et cetera, et cetera.
So all your responsibilities hold. If there was a red one there, for example, and you unset here, you remove the green glasses from the column this row was responsible for. It might look like this row is no longer responsible for anything, because it used to be responsible for this column. But if you unset this thing, the row is still responsible, and it's fine: my invariant still holds. Removing the green glasses keeps all your invariants intact. I mean, it's now a different matrix. It looks different, because you might have adjusted the matrix many, many times, and it looks different from where you started, but all the invariants still hold. Every single row is responsible for a column, and every column that somebody is responsible for has only that single one in it, right? So if you remove the green glasses here, this invariant still holds: this row is responsible for a column, and everything in that column is zeroed out except for the one. So your invariant still holds. And that's the really cool thing about the green glasses: backtracking and reason generation are both super easy. And now you may be able to appreciate why it's so damn difficult to do Gauss-Jordan elimination this way, when you have to do deltas. If you do plain row echelon form, you can give it to a sixteen-year-old and they'll do it. But when you have to do this kind of stuff, it's like: what are my invariants? Do they all hold? What happens when I have to set and unset variables? How do I manage that? How do I read out a reason clause? How do I make sure that every line is a linear combination of the input constraints, so that I can create a reason clause that is valid over the whole search space? Your reason clause needs to be valid over the whole search space.
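Reading a reason clause straight off a row, as described above, might look like this in Python. This is my own naming and a sketch under the assumptions stated in the comments, not the real solver code:

```python
# A row "without the green glasses" lists its variables plus a right-hand side.
# When exactly one of those variables is unset, the row propagates it, and the
# reason clause is: every assigned variable's currently-false literal, plus the
# forced literal. Literals are (variable, polarity) pairs.

def reason_clause(row_vars, rhs, assignment, propagated):
    """Build the clause justifying the propagation of `propagated`."""
    clause = []
    for v in row_vars:
        if v == propagated:
            continue
        # Each assigned variable contributes the literal that is false right
        # now, so the whole clause is falsified except the propagated literal.
        clause.append((v, not assignment[v]))
    # Compute the forced value of the propagated variable from the XOR.
    forced = rhs
    for v in row_vars:
        if v != propagated and assignment[v]:
            forced = not forced
    clause.append((propagated, forced))
    return clause

# a ^ b ^ c = 1 with a=0 and b=0 propagates c=1; the reason is (a or b or c).
print(reason_clause(["a", "b", "c"], True, {"a": False, "b": False}, "c"))
# → [('a', True), ('b', True), ('c', True)]
```

Because the row is a linear combination of the input XORs, a clause read off this way is valid over the whole search space, which is exactly the property the speaker insists on.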
I think I'm going to end there, because it's a good point to end on: you need all these tricks to make sure that the delta updates are meaningful. And about the practical session, just to prepare for it, because we're going to have lunch now, I think: you will get the full assignment each time, and you will not do deltas. If you really want to do deltas, maybe some adventurous people can, but in general it's much easier, of course, to run everything from the beginning to the end rather than doing this super complicated delta update, like, hey, there's a new variable that's been set, how do I update my current state to match? Instead, we're going to say: okay, here is the full state, you deal with it. Which is a lot easier. Of course, it's a lot slower as well, massively slower, but it's much easier to deal with, and you still get to understand what propagation, conflict, and reason mean. And we'll do cardinality constraints, which are much easier constraints than this, I think. It's really the at-most-K constraint, though all the ones I put down are at-most-one, because that's a lot easier to deal with. But if you want to, you can do at-most-K; it's more or less the same complexity when you write it in Python. And you won't have to deal with all this delta stuff, which is a pain to deal with, but also a lot of fun. Okay, I think I'm going to end there.
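To give a flavor of the full-state style of the practical session, here is what checking an at-most-one constraint against a complete assignment might look like. The names are hypothetical, and this assumes the whole assignment is handed over each time rather than a delta:

```python
# Full-state check of an at-most-one constraint: no watches, no deltas, just
# look at the complete assignment every time. Slow but simple.

def at_most_one(variables, assignment):
    """Return ('ok', None) or ('conflict', two_true_vars)."""
    true_vars = [v for v in variables if assignment[v]]
    if len(true_vars) <= 1:
        return ("ok", None)
    # Two or more variables are true: the constraint is violated, and any
    # two of the true variables already explain the conflict.
    return ("conflict", true_vars[:2])

print(at_most_one(["x", "y", "z"], {"x": True, "y": False, "z": False}))
# → ('ok', None)
print(at_most_one(["x", "y", "z"], {"x": True, "y": True, "z": False}))
# → ('conflict', ['x', 'y'])
```

Generalizing this to at-most-K is a one-line change to the length test, which is why the speaker says the two are about the same complexity in Python.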