Let's get started. Good morning, everyone, to the second day of the SAT/SMT school. Today we are going to have one tutorial followed by two talks, and it's my great pleasure to introduce today's speaker, a close collaborator and a good friend, Dr. Mate Soos. Mate obtained his master's at the Budapest University of Technology and Economics in IT security and wrote his PhD at Inria Grenoble on privacy-preserving RFIDs. Fortunately for us, at that time he started writing the SAT solver CryptoMiniSat, and that was a decade ago. I went through the list of solvers in the top five for the past decade, and it's clear from that list that Mate belongs to the coveted group of about ten people in the world who have built a solver from scratch and kept up with all the developments for over a decade. Of course, Mate does not just keep up with the developments: he has been a leader in developing a solver that is versatile and has won SAT Race in 2010 and the incremental track of the SAT competition in 2015 and 2016. Furthermore, the solver that Mate writes, CryptoMiniSat, performs tasks that are beyond most serious solvers; closest to my heart is CryptoMiniSat's ability to support Gaussian elimination, which has been crucial for the existence of hashing-based techniques. Mate is also the lead maintainer of STP, which has won several medals in the SMT competition over the years, and he maintains ApproxMC, the tool for model counting. So we could not think of a better speaker to tell us all about CDCL, SAT solving, and beyond CDCL. Let's welcome Mate Soos, and we look forward to his tutorial.

Hi everyone, I'm Mate. I have to admit I have a bit of jet lag, so please forgive me any slight missteps; hopefully it will be fine. I'm going to talk about SAT solving and CDCL(T), and I'm going to cut it into exactly two parts: this session, which is going to be more about SAT solving in general, and then we'll dig into CDCL(T) in the next session.
And in the final session we're going to have a practical part that hopefully brings all of this together into one coherent picture, where you can actually play with all of these things on your laptops using Python, which is basically the easiest way I could think of to make this comprehensible and also writable by anyone in this room, because Python is a relatively uniform language in the sense that most people can program in it. You have to take into account that most SAT solvers winning the competition are written in C or C++, but the most deployed SAT solver actually happens to be written in Java; you will hear about it later, or at least from its author, and it's called Sat4j. And I didn't want to push you to write C++ code, because it's very cumbersome, as you might know. I wrote these slides with help from Armin, Armin Biere, who, if you know SAT, you must have seen at least some of his papers, and who generously donated some of these slides. I will tell you when we switch over to my set of slides; a good chunk has been edited and rearranged for this session, obviously, but all the CDCL(T) material is written by me, and some of the SAT solving part I took from Armin, because he is one of the lead SAT solver developers. I will actually be talking about some of his code as well, because his solver, beyond being one of the best, is also one of the most comprehensible, which is quite an achievement, especially if you look at some of the SAT solvers winning nowadays, which are incomprehensible, including mine, actually. So with that out of the way, let's start with some introductions. We're going to be stuck with each other for four and a half hours, which is quite a long while, so it's kind of a relationship we're going to have, and I think we should do some introductions of each other. I'm going to start, and then you can follow up.
So, as mentioned by Kuldeep, thank you so much, I finished my PhD at Inria in Grenoble, and I did it in security and privacy. I am the maintainer of CryptoMiniSat, STP, and ApproxMC, so a few tools related to SAT solving and CDCL(T). Nowadays I work as a senior research fellow at the National University of Singapore with Kuldeep about three months a year, and nine months a year, so the majority of my time, I work at a company called Zalando, which is the largest online retail store for fashion in all of Europe, with some eight billion in turnover. We just made over a billion last week, because it was Cyber Week; I don't know if you know this Black Friday, Cyber Monday thing, it's a very American tradition that has been introduced to Europe, so in a single week we managed to make a billion euros of turnover. I do security there, so if you pin me down and strap me to a chair I can also give an IT security presentation if I really have to, but I very much enjoy doing SAT solving. It used to be my hobby, and now, thanks to Kuldeep, I have a framework for doing it. I like doing a lot of different things in SAT solving; of course CDCL and CDCL(T) are very interesting, but I like other stuff too: I work in machine learning, visualization, counting, and higher-level abstractions, that's sort of CDCL(T) and the other things you'll see, because I think they're fun. And now it's up to you. I'm kind of curious who works in industry here; by industry I mean not strictly academia: consulting, just writing code, any kind of data engineering, data science, that sort of stuff, for any company. Okay, so there's quite a few. Who considers themselves a student? That could also be a professor, I'm just saying. Okay, that's cool. And who are the people who work professionally in academia? Okay. And then there's what I also call myself, a hobbyist, which is a very interesting combination of being neither in academia nor in industry. So is there anyone who considers
themselves a hobbyist? This is a fun thing they do, this is what they're interested in, but it's not something they actually get paid for: doing SAT solving, or SMT, let's say, or verification and that sort of stuff. Okay, there are a few; that's good. All right, so we did a little bit of introductions, and I think it's good to know who is sitting here. So I'm going to start with this slide, which is quite a funny one, but the point is that it introduces propositional logic, where you have variables, in this case shirt and jewelry (well, clearly I'm not wearing a shirt, but I am wearing jewelry), and you have all these different symbols: disjunction, conjunction, and negation. These are the basic symbols that you need, plus the variables. From a variable we can also derive literals, and a literal is either just a variable on its own or the negation of a variable. If you have a look at the right-hand side there, you see "not jewelry or shirt", "jewelry or shirt", "not jewelry or not shirt". These are different ways of talking about whether a speaker is wearing one or the other, and the idea is that you're supposed to wear either one or the other, otherwise it's kind of impolite. Most of the formulas that I'll be talking about will look like this. This is what's called conjunctive normal form: a disjunction of literals is called a clause, and a conjunction of clauses is called a conjunctive normal form. And of course we can express NP-complete problems in this. Here the variables were just jewelry and shirt, but you can have any number of variables, up to millions, tens of millions, hundreds of millions, and we can solve them; it's not the number of variables that matters. Actually, on that note, let me take a little segue. Most people think the number of variables is going to make a big difference, but you can generate or create problems in a few
hundred-variable range that no SAT solver currently can solve, maybe just under a thousand variables, and if you could actually solve them you could definitely get a couple of million, maybe a couple of hundred million euros for it; and nobody can solve them. Yet at the same time, most industrial problems contain well over a hundred thousand variables, and we can easily solve those. So it's not the number of variables that matters, it's the underlying hardness of the problem. Of course, the problem has to actually be hard; that's what I was talking about when I said under a thousand variables. Take, say, AES: it will be extremely hard to solve, because if you could break AES there would be people out there who would pay you at least a hundred million without a blink of an eye. The problem is that we can't solve it, obviously, with any tool, including SAT solvers, whereas at the same time easy problems translated to a large number of variables a SAT solver will easily handle. This is the trap that most people fall into when they start doing SAT solving: they start asking, I have two thousand variables, will it work?
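To make the CNF machinery concrete before we go further, here is a tiny pure-Python sketch that brute-forces the jewelry/shirt clauses from a moment ago. The integer convention (1 = jewelry, 2 = shirt, a negative number for negation) is my own choice, picked to match the DIMACS style that comes up later; a real SAT solver of course never enumerates assignments like this.

```python
from itertools import product

# The three clauses from the jewelry/shirt slide, as lists of integers:
# 1 = jewelry, 2 = shirt, a negative literal means negation.
clauses = [[-1, 2], [1, 2], [-1, -2]]

def satisfies(assignment, clauses):
    """assignment maps variable -> bool; a clause is satisfied when at
    least one of its literals evaluates to true."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# Brute force over all 2^n assignments -- fine for two variables,
# hopeless in general, which is exactly why SAT solvers exist.
models = [
    {1: a, 2: b}
    for a, b in product([False, True], repeat=2)
    if satisfies({1: a, 2: b}, clauses)
]
print(models)  # only jewelry=False, shirt=True survives
```

The single surviving model, jewelry false and shirt true, is the same solution we will read off the DIMACS example later.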
And I'm like: I don't know. Are those two thousand variables about reversing AES? Then the answer is no. If the two thousand variables come from some simple scheduling problem, then the answer is yes. The underlying complexity of the problem makes the difference, not the number of variables. So we can look broadly at SAT solving; of course it's a bubble of its own, but within this bubble you can see it as: encoding the problem into this CNF notation that I just introduced, that's the encoding part right at the top; then simplifying this encoding using some techniques (I might have time to talk about some of them); and then doing some form of search on the simplified CNF. If you're really good, you do something called inprocessing, where you go back to the simplifying step and re-simplify your formula once you have done some search. And if you're even better, like Armin, or some of the more advanced solvers like CaDiCaL or CryptoMiniSat, probably also Sat4j, you can actually do re-encoding: that means you go back, or try to go back, to the original formulation from the CNF, and then re-encode the things that you find there. They typically do this for cardinality constraints, where the cardinality constraint was encoded badly: you recover the encoding, throw away everything that was the old encoding, and re-encode it, and it turns out you're going to be faster. This is where we normally are in a modern SAT solver, but in this talk we'll talk mostly about search (that's why it's in large letters), though we'll mention simplifying and maybe even re-encoding when we talk about CDCL(T) later. Now I'm going to talk a little bit about what SAT solvers are useful for. There have been some talks about this already, but not so much about code, because that's not something everybody necessarily understands, and one thing that we
could do with SAT solving is checking optimizations. I don't know if you've heard about this; there's actually a really interesting tool that does this for LLVM IR, the intermediate representation that LLVM uses, called Souper, which uses an underlying SAT solver to check optimizations on LLVM IR. I think it actually uses Z3; well, that's an SMT solver, but of course it has a SAT solver underneath. It simply checks whether the optimized version is equivalent to the unoptimized one. So it does equivalence checking: there's an unoptimized code, there's an optimized code, and it asks, hey, are the two equivalent? Of course, for this you need to understand the underlying syntax and what all those instructions do, otherwise it doesn't make any sense; in the LLVM IR case, you can parse LLVM IR and understand each and every line of code that LLVM can output, what it means, what it means to increment a variable, and so on. A lot of these equivalence checks can be translated down using the Tseitin transformation, which effectively works on a form of electrical circuit. If you look at the left, it's truly an electrical circuit, and of course you can make this circuit on the left-hand side out of the code that we just saw. Then you make this higher-level description on the right: you see that, for example, the output o is an XOR of y and w, so that's this thing here, and, say, the output v here is b or c, you see here, b or c, etc. So this is a higher-level description of what's here, and then we do what we call blasting it down: we encode these constraints into a more refined constraint here, which is just saying x implies a and x implies c; obviously this gate means x implies a, x implies c, and also a and c together imply x. And from here we can go down to the actual CNF here,
which is the same thing if you think about it; it's just expressed in this expanded format, but at the same time it's very specific, it's a normal form: it uses only disjunctions of literals, all conjoined together. So how many different things can be expressed this way? Well, lots; I'm just going to show a few. There's negation, which is quite obvious. There's disjunction, x implies y or z: you make this clause set out of it, the standard encoding of the formula you see on the left. Then there's conjunction as well, which is more or less the same except everything is inverted. And there are also if-then-else gates; I'm not going to go into detail, but you can read through this quite easily: basically c selects which of the two inputs we're going to take, and you can translate it down into this. What's interesting is that here we start bumping into something called arc consistency. Arc consistency means: I set some of these input variables, and setting them might or might not force certain other variables to a certain value, and the encoding sometimes doesn't actually express that. The encoding can be lazy in the sense that it's not arc consistent, which means that if I set some of the variables in a way that is clearly inconsistent, it will not fail; it will not give me the empty clause when I substitute all the variables in. But arc consistency is expensive, in the sense that we now have to write more clauses to express the same underlying expression. Negation is obvious, but for some higher-level constructs, for example this if-then-else gate, you start needing additional clauses: these two clauses here, actually; if I don't add these two clauses, then this thing will not be arc
consistent. It's still true, still correct, but it's not arc consistent, in the sense that for some combinations of the variables, if I substitute them in, it will not give me the empty clause; it will not fail even though it is a failure, it is incorrect. So if you start substituting in and the combination is not actually possible, this set of clauses will not tell you that immediately; you would have to do a search on it. And here comes a tradeoff: do you want this thing to be arc consistent, and so fail every time I put in variables that are inconsistent, or do you want it to be fast, because I can substitute the variables in very quickly since I don't have to add these two extra clauses? Yes? [Audience question.] So the question is: does arc consistency mean in this case that if we substitute in variables that form an inconsistent set of assignments, we derive the empty clause — is that the definition of arc consistency? That is the definition, and it is also correct that in search you will derive the empty clause; but then you have to do search, it's not enough to just substitute the variables in. For example, this one here will not derive the empty clause if t = 0, e = 0, and x = 1, because that clause is not in here; yet that is not a consistent setting for this gate, so you would have to derive it through search. This tradeoff happens quite often, especially with more complicated constraints: the moment you want arc consistency, you have to add many, many more clauses to enforce it. Instead, what a lot of modern SAT solvers do is derive these what we can call redundant constraints — redundant because I can remove them and the CNF still means exactly the same thing as it did
before, because I can derive this constraint from those constraints through search and resolution. Modern SAT solvers will derive them using the inprocessing technique that is here: they derive these through this, and sometimes through this, depending on which one we choose. Now I'm going to talk about XOR constraints, not only because I have a thing for them, but also because we use them quite often in counting and sampling. What's interesting is that this is a very simple constraint: a XOR b = 1, or let's say l1 XOR l2 XOR l3 = 1, etc. It's an extremely simple constraint if you think about it; all it is is a parity constraint: if everything is 0 here, that's not okay; at least one of them needs to be 1, or three of them being 1 is also okay. But if you look at the straight translation of this constraint, the number of clauses starts growing, and it grows as you would expect: 2, 4, 8, 16, etc. It's exponential, so that's not going to work; you will need to introduce new variables, otherwise you end up with an exponential number of clauses to express this super simple constraint. So what you do is add helper variables, in this case two helper variables, to cut this constraint into three chunks: now it's l1, l2, l3, helper; helper, l4, l5, helper; helper, l6, l7. If you XOR all of this together, the helper variables drop out and you get back the original constraint. So we needed extra variables to express this very simple constraint. If you work with polynomials over GF(2), then this is one of the simplest equations that you
can write, basically, and it's going to take you an exponential number of clauses to express without helper variables; so you will need helper variables. And this is one of the places where people start asking: how many helper variables should I add? Isn't the number of variables supposed to be the complexity of the problem? This goes back to the discussion from the beginning: the number of variables doesn't matter, it's the underlying problem that matters. So don't worry about adding variables; we add thousands of variables every time we add new XORs, and we don't care, I don't even count. What matters is the underlying complexity of the problem you put in, and the power of the reasoning engines over the things that are in there. This thing is called the cutting number: you see that I cut the XOR into clauses of length at most four, and that is called the cutting number when it comes to XORs. There are many, many other constraints, each with their own parameters and parameter spaces and different things you can say about arc consistency, etc. Here the cutting number is quite easy: you just cut the XOR into chunks. You can of course cut it into chunks of length five, or six, or seven, or a hundred, but then you'll have 2 to the power of 99 clauses for each and every one of them, so you probably want to cut it into shorter chunks because of the exponential nature; and of course we still need to translate each of these chunks individually into clauses. Okay, now I'm going to talk a little bit about cardinality constraints, and we will actually play with this later, which is kind of a really cool thing. There's somebody in the audience who I know is an expert in this, and I shouldn't be talking about it next to him because I'm sure he'd do a better job; but basically, a cardinality constraint is: we want to count how many of the literals in the constraint are assigned to true, and we upper bound this by
a certain number; this is called at-most-k. We want at most ten, or at most five, of these literals to be true, and so on. There are many different encodings of cardinality constraints, and I will not go into them, but you will have the chance of implementing, not an encoding, but a CDCL(T) solver, which handles, in Python, all the different ways that a cardinality constraint can produce a conflict — something inconsistent, so detecting an inconsistency — and also doing propagation, where you're just short of an inconsistency: if anything more happens you will be inconsistent, and you can detect this and then perform actions to avoid getting into the inconsistency. An inconsistency in our case is what we call a conflict, which is to say: we are at a position, with a set of assignments that we have made, that is conflicting; it's not right, something is off. That's something we'd like to detect. For example, with cardinality constraints, this again goes back to arc consistency: if you have a cardinality constraint encoding that is arc consistent, you will always detect if something is off — it's an at-most-k constraint and you have k+1 literals set, so you should always detect that. But another thing you should detect is that if you have exactly k set, then of course all the other literals must be 0. If it's an at-most-3 constraint and you have ten literals and three of them are already set, then you know everything else must be set to 0, otherwise the constraint will fail. That's also part of arc consistency, where you make sure that nothing can fail from now on: now that we know these are the three that are set out of these ten, the other seven must be 0. There are many different encodings, and one of the really interesting cases is the at-most-1 constraint. I was talking about the at-most-k constraint, where k is a parameter you can set to any value, which is interesting in many
places, and of course encoding at-most-1 is easier than encoding at-most-k, because there's more variance when it comes to at-most-k. All right, now let's talk about the DIMACS format, which is just a very simple file format for writing CNF for a SAT solver. When you want to use a SAT solver from the command line, you use this file format to describe what you want the SAT solver to work on, and the file usually has .cnf as its extension. Here we have the header, which tells me that this is a CNF with 2 variables and 3 constraints; you see there are 3 constraints and 2 variables, 1 and 2. Then we have the actual clauses: each of these lines is a disjunction of literals, and each of these lines must be satisfied, so if we get a solution, no line here is unsatisfied; the solution satisfies all these constraints. So "-1 2" satisfies this because minus 1 is true, it satisfies this because 2 is true, and it satisfies this because minus 1 is true. This must be read as: the first variable is false and the second variable is true; the 0 at the end just terminates the line. And you see it here as well: this is a literal that is inverted, a variable with a negation; this is a literal on its own; and then the termination of the line. Some SAT solvers, you will see, don't actually care about the header, and most SAT solvers can be instructed not to care about it. In particular, MiniSat doesn't care about the header; most people actually don't put the header in, and then some SAT solvers crash underneath them and they're very surprised when that happens. But MiniSat is kind of special, and my SAT solver also doesn't care, because it's built on MiniSat, but also because people make this mistake all the time and I don't want to bother them too much. Still, having the header is quite nice, because then you can preallocate
some data structures, etc., so you will see that if you put the header in, it sometimes works a lot faster. Here we're just using PicoSat, which is a very, very small SAT solver that is used all over the place because it's so tiny; it's plain C and very simple. The "c" lines that you see here are comments. This is actually non-standard, but a "c" line means a comment, and you can add comments at the end of any line or on any new line; in general you're not supposed to do this, funnily enough, but since Armin does it, and I do it, and almost everybody does it, it's more or less standard at this point that you can add comments at the end of clauses. So this is of course saying "not jewelry or shirt", "jewelry or shirt", "not jewelry or not shirt". Okay. If you want to use SAT solvers, you will often not be using them from the command line. If you use the command line, you use this DIMACS format that I just described, but often you want to use the solver as part of a larger whole. It's very rare that a SAT solver directly solves your problem; I've yet to meet an industrial person who comes to me and says "my problem is SAT". Their problem is scheduling, their problem is routing, their problem is all these kinds of higher-level problems, and then you, as a consultant or a researcher or a hobbyist or whatever you think yourself to be, come in and use this as a tool to fix their problem. So you will want to use the SAT solver as part of your system, and for that you need an API, an application programming interface, where you can call the SAT solver over and over again with different problems, different encodings, with some online problem, for example a routing or picking problem — a picking problem is when people are in a warehouse and they go around with a little cart and need to pick the items for the new orders that come in, and obviously you need to run the SAT solver all the time because new orders keep coming in and you're trying to
optimize the routing schedule or the picking schedule, the way they pick the items from the shelves. For that you need some kind of programming interface, and so an interface was created called IPASIR, which is a standard interface for SAT solvers: you can just swap the SAT solver underneath and not even notice that you have changed it. It was mostly derived from MiniSat's original API. You can add clauses, you can call the SAT solver, and you can also retract clauses — remove clauses and say, okay, actually I didn't want to add that clause. To do that you use something called push/pop: you can push and pop state; you can say, push the state, I want to remember this state, do some things, and then pop the state and go back to where you were before. This is sometimes important because you reach some point in your search, which might be a higher-level search, you want to do some subsearch, and then you want to go back to the original position and do some other work below. And we do this through what are called assumptions, which is to say we can solve under assumptions. You can solve the problem on its own, and it will be solved; but you can also say: let's solve this, but with a actually set to zero. What's nice about that is that when you solve it and get a solution, a is of course going to be zero; but if you get an answer that says no, this is unsatisfiable, you can then say, okay, let's solve it with a equal to one, or let's solve it without any constraint on a, so a can be either zero or one. So you can go back to the original problem, which is nice, because if it was unsatisfiable under the assumption a = 0, you can still continue; in a normal CNF setting, of course, once a problem is unsatisfiable, it's unsatisfiable, there's nothing you can do about it. But solving with assumptions allows you to do this trick, and many,
many systems, for example our system for solution counting and for sampling, use this heavily, and you will see that a lot of systems built on SAT solvers rely heavily on solving under assumptions. This is the interface; I'll explain it here in a little more detail. That's the state model: here you have the SAT solver, you add variables and whatnot, and then you're solving, and if the solving finishes with unsatisfiable, then you're unsatisfiable, bad luck; but if you solved under assumptions, you can still go back and start again. And sometimes the result is unknown: you can set certain limits on how much time or resources you want to spend on trying to solve a particular problem. Maybe this is actually a better one; this is easier to read than the previous slide, I would say. Here you see: you initialize the thing, you add literals to it, you solve it, and you can query whether an assumption failed; and here is where you can add a termination criterion, for example a time limit or whatever you want. And here's a very simple example, in C++ of course. IPASIR, actually, I think has a Java interface as well — does it? I'm wondering, does it have a Java interface? I think it does, yes, because I remember writing something about that. You will see: here's the solver, we initialize the solver, we add the constraints that we saw before. You see the literals: minus tie, then zero, the same thing we did before, right? The literals one by one, zero to terminate; literals one by one, zero to terminate. Now we solve, we make sure the result is 10, which means satisfiable, and we print "satisfiable", print the value of shirt, print the value of tie, and a newline. And now we're going to assume tie and shirt: you see, we assume the tie and
the shirt, we solve, and the result is unsatisfiable; there's no such solution, and we say, hey, it failed. Then we release the solver and return. So this is a very easy use of the IPASIR API. I think you will use this if you want to use SAT solvers, because the most efficient way of using SAT solvers is through this interface. If you use SAT solvers through the command line, you're going to run into speed problems, because they have to read the file every time; it's also not a very convenient way of using a SAT solver from within a system, because you have to start a new process, run the executable, check the output — it's a pain to use it that way. It's much more efficient through the API, and this is what we do, for example, for any kind of high-level constraints, for example counting, which has a bunch of high-level constraints; we use effectively an interface just like this. All right, now let's talk about this magical search that I haven't yet explained. It all started with something called DP, then we went to DPLL, and then we went to CDCL. Let's start with the Davis-Putnam procedure, which is originally a resolution-based system; it's actually not a search system, I'll explain in a second. The second one, DPLL, basically made a tradeoff between memory usage and time. The DP version keeps eliminating variables — I'll explain that in a second — from the top to the bottom, and it tries to derive the empty clause; of course, the empty clause means the formula is unsatisfiable. The second version, DPLL, does a branching version of this: instead of incrementally eliminating the variables from the formula and trying to derive the empty clause, it does a case-by-case analysis: what would happen if x is 0, what would happen if x is 1, so now we have two cases, and of course we can do this with y
underneath etc so let me just jump into here so this is dpll where we do a decision at the top and then we do another decision and then we reach some some point at the bottom here down which is which may or may not be unsatisfiable then we're going to go back and try this not go back down this just notice that actually I can pick another decision like once I'm finished with this branch you see that here I the a b and c is the order but here it's a c and b the order so the order doesn't it's not it's not doesn't it doesn't matter which order I pick once I'm in a in a in a new branch but within the same branch of course I cannot just you know randomly choose an order here anyway the dp procedure so the the decision-based procedure just does a very simple system where if the formula is empty then we're satisfiable there's nothing to satisfy but done if it contains the empty close then it's unsatisfiable otherwise pick a variable add the resolvance on x on this variable and then remove the closes that contain x or not x and start again right and eventually all the variables will disappear and I'm done the problem here is that this can be this this this this issue here this add or resolvance on x this can actually be exponential in if if you if if the if the formula is of a specific type and for example if it's all these x or stuff you will realize very quickly that this will just blow up and will never fit into your memory so of course the next idea was like why well instead of adding all the resolvance we're just going to branch left and right branch left and right and eventually we have searched through all the search space there's nothing left so here let's let's talk about this like so I was talking about like doing the the the resolvance so there on the variable elimination is a version of this resolvance thing except that we bound it in the sense that we don't always resolve them but the only way to resolve version is actually very easy so this is your your set 
of clauses here so some of them contain not x you see like three of them contain not x and two of them contain x and now we're going to resolve everything with everything so we resolve like not x so one with five for example you will see that you'll get a or d because the x obviously drops out and you get a and d so like let's say x or not a or not b if you resolve that with not x or c then you get c the x drops out not a and b so that's what you get here and you see that actually this is one two three four clauses and this is actually five clauses so now we have actually less clauses and one less variable so this sounds like a win-win but of course this can happen that this complete resolution on x on the left hand side can result in many many many more clauses on the right hand side right so if there were like ten here from not x and ten here from x then you can actually have a hundred on the other side right so you know you started with twenty and now you ended with a hundred imagine if you do this with every variable of course it's just going to go crazy it's never going to fit that into memory so that was the original dp procedure that I explained here right so this is add or resolve on x that's the thing that we did here except that here we got lucky the resolvent is actually less than the original and this bounded variable and nation actually was a massive massive leap forward in 2005 so without this we would be stuck back we would be back in the stone age effectively because this made it possible to solve many many more instances because it turns out that if you do this as a pre-processing step so remember the encoding simplifying and search so this is squarely in the simplifying part if you do this as a form of simplification of course you try every single variable which are the ones that I can eliminate and still have less closes that I started with like it's a win-win I got one less close and one less variable in this case like that sounds good to me 
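The elimination step just described can be sketched in a few lines of Python. This is a toy illustration, not how production solvers implement it (they use occurrence lists and abort the simulation early); the clause representation and function names are my own:

```python
# Clauses are frozensets of signed ints: positive = variable, negative = its negation.

def resolve(c1, c2, x):
    """Resolve c1 (containing x) with c2 (containing -x) on variable x."""
    r = (c1 - {x}) | (c2 - {-x})
    # Tautologies (containing both v and -v) are dropped.
    if any(-lit in r for lit in r):
        return None
    return frozenset(r)

def try_eliminate(clauses, x):
    """Eliminate variable x only if the clause set does not grow."""
    pos = [c for c in clauses if x in c]
    neg = [c for c in clauses if -x in c]
    rest = [c for c in clauses if x not in c and -x not in c]
    resolvents = []
    for cp in pos:
        for cn in neg:
            r = resolve(cp, cn, x)
            if r is not None:
                resolvents.append(r)
            if len(resolvents) > len(pos) + len(neg):
                return clauses, False  # bound exceeded: abort, keep x
    return rest + resolvents, True
```

For example, eliminating x = 1 from {(1 or 2), (-1 or 3)} yields the single clause (2 or 3): one variable and one clause fewer.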
Actually, this has lately been improved further by making it slightly unbounded: you're allowed to grow a little, a few more clauses, trading a few extra clauses for one variable less. So sometimes it still grows the CNF, but with fewer variables. A relatively new thing is bounded variable addition (BVA), which is basically adding variables in order to remove clauses; a very interesting idea, and the really hard part is how to do it. Here is a set of input clauses that only has a, b, c, d and e inside; I won't go through it, I don't think it's worth your time. On the right-hand side you see we also have x, not just a, b, c, d and e, but we have fewer clauses: we started with six and ended up with five. And if you reverse it, if you do the resolutions and go back, you get the original set. So it's flipping the whole thing around: adding a variable to remove clauses.
[Question:] In bounded variable elimination, is there a simple way of figuring out whether resolving on a given variable will lead to a reduction?
Good question; you can use good data structures to help you do this. The question was: how easy is it to figure out that elimination will lead to something smaller, because you need to figure this out for every single variable. The way we do it, if you look at basically all current SAT solvers, is that we simulate it: we start resolving, and if at some point we exceed the bound, we abort and roll back. We also order the variables we try, because the order matters: if you eliminate x first and then decide to eliminate d, you have just changed the resolvents for d. You run this once, not to a fixed point; and even if you did run it to a fixed point, I believe the order would still make a difference. The point is that we order the variables to start with the ones that have very few occurrences. If x occurs only twice, most likely I can eliminate it and it will not grow; if it occurs a thousand times, imagine the tons and tons of resolutions I'd have to do just to figure out whether I can eliminate it. Even at 100 occurrences that's up to 50 times 50 resolutions to see whether it's worth it, and I would never even try; there are heuristics that simply say this variable occurs in so many clauses that there's no point in trying.
[Question:] All of these methods seem to share a common thread: the point is to minimize either the number of variables or the number of clauses. But as you indicated before, minimizing the number of variables hardly changes the hardness of the problem, if the unsatisfiability is inherent to the formula itself. So are there any reasons to perform this other than minimization?
Yes, the question is right: the number of variables doesn't change the complexity of the problem on its own. But it matters a lot for the heuristics. When you branch, it matters which variable you branch on, and that's driven by a heuristic; with fewer variables there are fewer mistakes you can make in branching, and you have more information per variable, because what we do is track how branching decisions affect performance and then readjust our branching accordingly, and having fewer variables makes these heuristics a lot better. It also matters for memory usage. We use data structures that are specifically optimized for cache locality, and if you have a lot of variables you will be chasing pointers all over the place, jumping around in memory all the time. The number of steps you take may be the same, but each step takes you longer. Maybe I'll have time to talk about the two-watched-literal scheme; we effectively keep a kind of optimized occurrence list, where every single literal has a separate list, a bunch of pointers you have to follow. The fewer variables you have, the more compact the whole problem is inside this particular data structure, and so whenever you try to do an operation, it takes less time, because your data structures are more compact.
So that's the answer; it's actually less about algorithmics and more about the practical way memory hierarchies work. SAT solving is one of these weird areas, and that's one of the things I like about it, so maybe this is a good question to dwell on for a moment. SAT is really weird because you have these very high-level theoretical concepts like proof complexity, which bound your SAT solver: if the problem is, say, symmetric, then you're more or less screwed, you'll never be able to prove unsatisfiability, because the proof would take you billions of years just to write down. And at the same time you have to optimize for the cache hierarchy of current CPUs, which is very strange, because most research is one or the other: either very optimized code that takes advantage of all the bits and pieces of modern CPUs to speed up computation, or very high-level work where all you think about is proof complexity and width and graph-theoretical concepts. SAT is the whole thing from A to Z. I like the practical stuff more, but you'll see the proof side is interesting too, and these high-level concepts are interesting, but in this case we jumped into the low-level end, where it really is the memory hierarchy: cache misses are really expensive, and fewer variables means everything is tighter in memory (you will see the data structure we use), so you have a better chance of hitting the cache. A better chance of hitting the cache sounds unexciting, until you realize that modern CPUs run so extremely fast that if something is not in the cache, you might have to wait 100 clock cycles or more for the data to arrive. Imagine being 100 times faster; that's ridiculous. Imagine you're going down the highway at 100 and somebody passes you at 10,000 km/h, and you're like, what the hell just happened? That's basically the difference. And of course people say, this is so much faster; well yes, if you have everything in cache, it's a lot faster.
All right, so this is the DPLL procedure, and it's more or less the same as before, if you think about it. You start the same way, but you simplify with the variable you pick: you say, I pick x, x equals true, and I substitute x equals true, simplify the formula, and recurse. Simplifying the formula is what we call BCP, Boolean constraint propagation, where you substitute the variable's value into the formula and see what happens; if there's a clause that now forces y to be zero, you also set y to zero and simplify, and you do this until fixed point. That's BCP. Then, once you've simplified everything and you're still not done, you pick another variable, say v10, set it to zero, see what happens, and recurse; eventually you find a solution. It's a very simple search procedure; not what modern SAT solvers do, but actually not very far off. Here's one example I'll walk through: we take the decisions a equals true and b equals true, and now BCP kicks in: if a is true and b is true, then this clause at the top, the blue one, forces c to be zero, because that clause can only be true if c is zero. One thing I want to mention here is that there are different types of SAT solvers, and one of them is called lookahead
which isn't so popular anymore, but it's actually a very interesting solver, and it does exactly what you see here. It obviously has very smart heuristics to pick the decision variable, because it makes a big difference which decision variable you pick, and it does a lot of preprocessing on every branch: it takes a equals one, does tons of preprocessing, tons of logic at this particular node of the graph to see what can be simplified, then tries all the different decisions it could make and picks the best one, eventually exhausting the decision tree. Of course, some branches just stop working: if you set b to zero here, it might immediately stop, because the preprocessing tells you this formula contains the empty clause, and then there's no solution there; and of course it also stops if it finds a solution at any point. So that's what's called a lookahead solver. It's kind of dumb, but actually it's very smart; it's just smart in different ways, and we are dumb in different ways; it's a different trade-off. CDCL solvers do very little at each node of this graph, they're basically completely blind at every single node, but they do it really, really fast. Lookahead solvers reverse this and basically say: well, I'm going to be really slow, but I'm going to do the right thing, I'm going to branch on the right variable every single time. Whereas CDCL solvers say: well, that was a bad branch, who cares, I'll undo all of it and start again, and it's going to be quick. A different trade-off. But I'm not going to talk about lookahead solvers; this talk is about CDCL solvers and CDCL(T).
Right, so the CDCL concept was originally implemented in the context of the solver GRASP, and it is basically learning what's called a nogood, but in a smart way. A nogood is this: say you went down here and took two decisions, a equals 1 and b equals 1, and it failed. The original nogood idea, funnily enough, is from Stallman; this is more or less the only paper he ever wrote. If you took the decisions a equals 1 and b equals 1 and they failed, then obviously you can add the clause not a or not b, because those two together cannot work. That's what's called a nogood: every time you go down a set of decisions and things fail, you add the nogood; it's what we now call the decision clause. In SAT solvers we don't use this too often, but it has started being used again; it's one of those magical things about SAT: people forget things, and ten years later they come back to them, and ah, it actually works, let's do it again. I actually use this nogood idea when the number of decisions I've taken is very small, say three or four: if it failed, of course you can just add the clause that negates all those decisions, saying that set of decisions is definitely wrong, so I can cut the search space next time by adding this nogood.
Instead of plain nogoods, though, we normally learn something called the first unique implication point, first UIP. I will actually not cover it in detail in this lecture; I decided not to, because it just confuses everyone, and at the end of the day you don't need it to understand how SAT solvers work; you can read about it in various blogs and papers, and it would confuse you more than it would help you. But the point of first UIP is that you can resolve the clauses that took part in the conflict. If you look at this conflict, two clauses took part in it. The one at the top forced the propagation of not c: c equals zero was forced by that clause, because a was one and b was one, so not c must have been set. The other clause that took part in the problem here, the reason there are no more solutions, is this other clause, which says not a or not b or c, and that's not possible: look at it, not a is zero, not b is zero, and c is also zero. So this clause is falsified and we need to do something about it. The two clauses that took part are the blue one and the red one, and if you resolve them on c, what you suddenly get is not a or not b. First UIP does this magic in general: it takes the clauses that took part in the conflict and resolves them; of course you can resolve them in different orders and stop at different points, and this is why it's called the first unique implication point, because that tells you when to stop the resolution on the clauses. And you save that clause as a memory of the conflict you had here, never to go back to that same place again, to ban that part of the search space.
Let me show you an example search tree. What happens is that the solver starts right up there; it does all these branches, you see these little branches going left, left, left, and that's BCP, Boolean constraint propagation, where I'm substituting the variables into the clauses and values are popping out. Then there are more and more branches, and down here at the very bottom you see there's one more branch, and then there's a solution all the way down. And this thing here at the very bottom, that's the learned clause: the resolvent of all the clauses that
were leading us to this place, a memory never to go back to the same place again. And what we do is we go back, and of course we flip the decision, and go again; and we flip the decision, and things don't work out, and don't work out, and actually here we go back all the way up to right there, we flip that one, and now we go right, all the way down here, flip, flip, flip, flip, and eventually there's the solution, right there. So that's one way of visualizing a solver run. Maybe what you want to take away from this, other than the pretty graphics, is that we're missing a chunk here; we're completely missing this chunk. This is a blow-up here, but you see there's nothing there: in all these places we abort early, we don't go all the way down, we abort right up there. And the reason we abort early is that, as we substitute the values we know into the clauses, an empty clause popped up. This is basically the difference from something like a brute-force engine, which evaluates the function all the way to the end every time: it starts at the beginning, puts all the values in, evaluates to the end, no good, goes back up, tries the other combination, and of course there are two-to-the-n combinations, because it's a brute-force algorithm, so you'd have a massive black box here, because you really try everything. Here, instead, you keep track of where you are, and with quite a small update you skip a good chunk of the search space; you completely skip that part of the
search space. So this is the difference between a SAT solver and, for example, Bitcoin mining, if you know how that works: it goes through every combination of inputs to a hash function and verifies whether the output matches a certain pattern, and if it doesn't, it starts from the beginning, tries another combination, goes to the end, no good, back to the start; it's really stupid. But if you think about it, it's a trade-off: you don't have to do all these data structures, understanding where you are, deriving conflicts, all these small things. Instead you can put it on, say, a GPGPU system and evaluate it on a thousand stream cores at the same time, in parallel, bit-sliced if you enjoy doing that sort of stuff, and then it's really fast. But it's not smart: it will not abort early, it goes all the way to the end every time, sees whether it matches the pattern, goes back to the beginning, tries a different combination, and does this forever. So that's the difference between a SAT solver and these brute-force engines: one of them walks the whole damn search space from beginning to end without thinking. But of course not thinking is an advantage: it means you don't have to do all these data structures and all this checking, boom. Sometimes that works really well, for example for compact problems in cryptography, where the number of variables is, say, 50 or 80. But here, if you have thousands of variables, you cannot possibly do that, because it will never finish; so now you have to do a different search algorithm. Maybe we won't go through all of
this, because we'll be tight on time. OK, so this is where we were, and I explained how this one in particular worked; this is basically just the same thing, and we keep deriving these new clauses that are being generated. We can step through it; it's not such a big deal. Here these two clauses get resolved, and you learn the resolvent of those two clauses; now you go back up and take the decision a equals one again, and suddenly b propagates, because we use this learned clause to propagate b: a is one, we substitute a in here, and b gets set to zero. And once a is one and b is zero, suddenly this one, the green clause here, propagates: a is one, so not a is zero, b is zero, so this is zero, which means not c propagates, so c equals zero. But suddenly we reach a conflict again, because now a is one, b is zero, c is zero, and that means this clause is zero, zero, zero. Now all these clauses, all the colored clauses, were part of the conflict, and we derive a new clause, which is just c. Then we do the same thing again, and if we keep doing this, eventually we learn the empty clause, and we're done. That's how this particular example works; the coloring of the different clauses is just there to help you match up why the different things happen, and if you start resolving them you eventually arrive at the empty clause.
I won't have time for this, and neither for this. What I will talk about instead is something called backjumping. When you derive the conflict and you derive a clause, a resolvent of some of the clauses (not all, but some) that led to this particular conflict, sometimes that clause actually implies that not only the last decision you made was wrong, but a decision previous to it was wrong. So it can say something stronger than just "reverse the previous decision", which was all the DPLL procedure could do; remember, DPLL just goes left and right, left and right, a full search. Here you can derive a conflict clause that says this entire branch is wrong, don't bother with it, you can go back all the way and reverse x; we jump over this part, and the part cut off with the blue line is completely skipped. This is called backjumping: when you do this resolution-based derivation of the essence of why you had a conflict, why things didn't work out, it can happen that you can do more than just reverse the last decision; you can reverse the one before it, or even a decision further up the chain.
All right, and there are different heuristics for picking the branching variables. One that is very well known and always used nowadays is VSIDS, Variable State Independent Decaying Sum, which is basically tracking which variables appear in these conflicts. It turns out conflicts are a really good indicator of which variables are playing a role in your search and which variables are most likely to be problematic, because those are the ones you want to branch on first. So basically you do a search that is focused on variables that appeared recently in derived clauses, and the emphasis here is on recently. Of course the question is how recent you want to be: would you want literally the last variable that appeared? That's called VMTF, the variable move-to-front strategy: this was the last variable you saw, well, move it to the front, that's the next thing we branch on, if it's available; if it's not available, if it's already set, I can't branch on it, but if it is available, I branch on it first. So that's
this variable-branching strategy, with a rather crazy heuristic: that's the last variable you saw, poof, that's the one I branch on. A maybe saner approach, or maybe it's not a matter of sanity, it's a matter of whether it works, is to assign scores to variables: you add a score to a variable whenever you see it in a conflict. You can do this linearly, just adding the score every time, or you can do it exponentially: if I see this variable, I add the score to it, and I make every other score smaller by a factor of, say, 0.8; the next variable I see in a conflict gets the same score added, and everybody else's score gets multiplied by 0.8 again, and you do this all the time. Then the more recent something is, the more likely you are to pick it next, because it's just an ordering; the data structure is actually a heap, and you keep picking the most recently active variables. So if you saw a variable ten times recently, but not the very last time, you'll still pick it; whereas the move-to-front strategy says: you saw this one last, that's it, that's the one we pick; the history is completely erased, all that matters is what happened just now.
And actually, before all of this, as you see at the top, there used to be something that didn't deal with variables at all; it dealt with literals. Then MiniSat came in, and this literal-based decision heuristic, which doesn't branch based on a variable but based on the score of a literal, was effectively completely replaced with this variable-based decision heuristic, and everybody now uses variable-based decision heuristics: they say, OK, I'm going to branch on variable v, and then I'm going to set it to some value. The value MiniSat sets it to, for example, is 0; it's a very good heuristic: the answer is always 0. So it branches on v and sets v to 0, and if that doesn't work out, of course it flips it to 1; it's not that it never sets it to 1, but the first time it sets it, it sets it to 0. That was later changed into something called polarity caching: whatever value the variable last ended up with, that's the value you branch to next time. If the variable got assigned to 1, then next time you need to branch on it, you set it to 1; if it was last set to 0, you branch to 0. This is called polarity caching, and it works really, really well, because it means that every time you search, you find yourself more or less in the place you were last time. Not exactly, of course: the decision order might be different, because the heuristic I talked about here, the activity, the recency of variables in the queue, might be different at a different point in time, so the order of branching might differ, but you arrive at roughly the same point in the search space. And of course, there's the lookahead solver thing I mentioned, where the way you decide on variables is substantially different: it doesn't keep a queue or a heap or scores, it individually checks every single variable it could possibly branch on and evaluates how good it would be. It's extremely expensive, so every single decision can take you minutes or so on a really large problem, but it's
very fast it's going to branch the right way and there are some systems that actually do a version of the two so doing both at the same time you do like very slow and and expensive decision at the top of your search tree and then you do really really quick at the bottom so you basically do one type of solver at the top and a different type of solver at the bottom I'm not going to talk about this but it's a loop so basically this is how most solvers work when they talk about CDCR so first we're going to check if it's unsatisfiable if it's unsatisfiable you know off we go we're done if not then we're going to propagate we're going to replace all the values the current values into all the closes and see if anything pops out you know if there's a close that suddenly becomes A is equal to zero then you know we set A is equal to zero and we do this and then if we're satisfied all the closes are satisfied then we're done if not we're going to decide that's it, it's quite easy that's the good thing and of course if your propagation didn't work so when you were propagating you were replacing the things and you came out with an empty close so you were replacing the variables with the current assignment and turns out that a close is not satisfied then you do this analysis then you do this analysis to resolve the closes together to find a conflict close that describes why you ended up in this place and this is the graph that I just showed you where it does exactly that and I think I'm going to have maybe two more slides and then we can all enjoy our little break first is reducing learning closes so you derive all these and you do all these analysis and you get all these closes and explain why you ended up in this place and it's not a great place to be because it means that now you grow your set of closes all the time every single time you do a conflict you have one more close and imagine that modern start solvers easily do millions of conflicts and you don't want to add a 
million closes to your database because that's just going to slow you down so you have to reduce them and you have to remove some of them and there are different heuristics for removing them some of them used to be like well if it's short we're going to keep it if it's long we're not going to keep it better is that you use different kind of heuristics like how active is this close is it being used, is it actually useful am I doing anything with this close or is it just there and doing nothing using the memory and then this LBD is sort of a new heuristic sort of because it's 10 years old and it's computing a static value that based on the resolutions that you made and the decision that you're at in your search tree and if this static value is small than a certain value then you keep it and if not then you throw it away eventually and then Kodip and Raghav and myself have been working on one that actually computes, auto computes this heuristic based on machine learning and proof traces and a bunch of other things to basically derive a heuristic based on the actual running of the SAT solver and finally the last little ingredient into our SATSoup is what's called restarts where you have been searching for a while you can actually say well let's abandon this search part and start completely new, start again now the thing if you think about it if you do this with DPLL so the original search algorithm where you go left, right left, right, left, right if you start from the beginning you just throw away all the work that you ever did you start from the beginning, left, right, left, right so I have to do again everything from the beginning the DPLL search procedure is a search procedure that just does all the way and the way it derives unsatisfiabilities is that I have searched all the search space there's nothing there, you know now I'm done but here we don't actually do that what we do instead is we keep these learn closes as memories for the things that didn't work out 
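To make the clause-database reduction concrete, here is a minimal sketch in Python of the kind of policy described above. This is not any particular solver's actual code; the `Clause` class, the `glue_limit` of 2, and the "keep the most active half" rule are illustrative assumptions, though they mirror the common LBD-plus-activity scheme: a clause's LBD is the number of distinct decision levels among its literals, low-LBD ("glue") clauses are kept unconditionally, and the rest compete on activity.

```python
# Hypothetical sketch of learnt-clause database reduction in a CDCL solver.
# LBD and the keep-half-by-activity rule follow the scheme described in the
# talk; all names and thresholds here are illustrative, not a real solver's.

from dataclasses import dataclass


@dataclass
class Clause:
    lits: list            # literals as signed ints, e.g. [1, -2, 3]
    lbd: int = 0          # literal block distance of the clause
    activity: float = 0.0 # bumped each time the clause appears in
                          # conflict analysis


def compute_lbd(clause, level_of):
    """LBD = number of distinct decision levels among the clause's
    literals; `level_of` maps a variable to the decision level at
    which it was assigned."""
    return len({level_of[abs(lit)] for lit in clause.lits})


def reduce_db(learnts, glue_limit=2, keep_fraction=0.5):
    """Keep all 'glue' clauses (LBD <= glue_limit) unconditionally;
    of the remaining learnt clauses, keep only the most active half."""
    glue = [c for c in learnts if c.lbd <= glue_limit]
    rest = sorted((c for c in learnts if c.lbd > glue_limit),
                  key=lambda c: c.activity, reverse=True)
    keep = int(len(rest) * keep_fraction)
    return glue + rest[:keep]
```

For example, a clause `[1, -2, 3]` whose variables were assigned at decision levels 1, 1 and 2 has LBD 2, so with `glue_limit=2` it would survive every reduction, while high-LBD, low-activity clauses are the first to be thrown away.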
and actually the learnt clauses are the key; the search is just directing us to the learnt clauses. I mean, you can look at SAT solvers as search-directed resolution systems, where basically all you're doing is pretending to search, but what you're actually doing is looking for the resolvents, the little things at the bottom. You're just fishing for this stuff at the bottom here; that's what you're really looking for. Each one is your memory of a place that you don't need to go back to, because there's nothing there: that part of the search space is empty, and this clause encodes that. And these are resolvents, so you're effectively building a proof of unsatisfiability through these little memories of the things that didn't work out. And that's the reason why you can restart: it doesn't matter where we were, we can just restart, do the search again, derive more of these little memory pieces of the things that didn't work out, and eventually, through these little resolvents, these conflict clauses, we're going to build a proof.

So the whole thing is reversed: it looks like it's doing a search, but actually it's building a proof. It's even building a proof when it finds a satisfying assignment, because what it actually did was build a proof that said: hey, all the other places, don't bother with them, there's nothing there. I'm going to build a proof that there's nothing there, and eventually hope that the search will find the one solution that does exist. And if there is no solution, then you just hope that the resolution proof will be good enough, that we'll get there. So the whole search thing is actually a trick that the devil pulled, because it looks like it should distract you into searching, but it's actually the proof that we're building that's the core of all SAT solving, in a sense. And if the proof that you're trying to build is not possible to build in reasonable size, because in some cases, for example for the pigeonhole principle, the proof itself is extremely large, then you will know that this thing will never finish. I mean, it will finish theoretically speaking, but the sun might turn into a white dwarf and you'll still be waiting for it to finish. And that's when you realize: okay, it really is the proof, it's not the search; it's just a search-directed proof engine. And that's basically where we're going to pick up next in this talk.
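The pigeonhole principle mentioned above is easy to state as CNF, and Haken proved in 1985 that resolution proofs of its unsatisfiability grow exponentially in the number of holes, which is exactly why a search-directed resolution engine cannot finish on it. Here is a small sketch that generates the formula; the encoding shown (DIMACS-style signed-integer literals, one variable per pigeon/hole pair) is the standard one, though the function and variable names are my own.

```python
# Sketch of the pigeonhole formula PHP(n+1, n): n+1 pigeons, n holes,
# every pigeon must sit in some hole, and no hole may hold two pigeons.
# The formula is unsatisfiable, but its resolution proofs are
# exponentially large, so a CDCL solver will effectively never finish.

from itertools import combinations


def pigeonhole_cnf(n):
    """Clauses (lists of DIMACS-style signed ints) for n+1 pigeons
    in n holes. Variable var(p, h) means 'pigeon p sits in hole h'."""
    def var(p, h):
        return p * n + h + 1  # map (pigeon, hole) to 1..(n+1)*n

    clauses = []
    # every pigeon sits in at least one hole
    for p in range(n + 1):
        clauses.append([var(p, h) for h in range(n)])
    # no two pigeons share a hole
    for h in range(n):
        for p, q in combinations(range(n + 1), 2):
            clauses.append([-var(p, h), -var(q, h)])
    return clauses
```

For n = 3 this produces 4 "at-least-one-hole" clauses of length 3 plus 3 × C(4, 2) = 18 binary "at-most-one-pigeon" clauses, 22 clauses over 12 variables; the clause count stays polynomial in n even though every resolution refutation is exponential.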