Saurabh Joshi is currently working at IIT Hyderabad. His research interests are formal methods, program analysis, and constraint solving. Many tools have come out of his research, such as Open-WBO and PINACA, and these tools have won accolades at events like SV-COMP and the MaxSAT Evaluations.

So, this is joint work with two of my undergraduate students, Prateek and Sukrut, and with Ruben from Carnegie Mellon University. One primary disclaimer for the theory people: by approximation I do not mean it in the approximation-algorithm sense. There is no approximation factor here. I am using the word approximation in a very loose way, something like saying that 343 divided by 341 is almost 1. So, let us get right into it. The techniques are very simple, but as Venkatesh mentioned this morning, in the solver space very simple things, changed a little bit here and there, can give you a lot of advantage, or sometimes the other way round.

Since this is about applications of SAT and SMT, one very good application of SAT solvers is maximum satisfiability. We are going to use a SAT solver as a black box for solving MaxSAT. First, what do I mean by incomplete MaxSAT? MaxSAT is an optimization problem where we want to know the maximum number of clauses we can satisfy. If the formula is unsatisfiable, you still want to hit some optimum: give me an assignment that satisfies the maximum number of clauses. Incomplete MaxSAT means that incomplete MaxSAT solvers, or incomplete algorithms, do not guarantee the optimality of the answer. The motivation, and again I will refer to Venkatesh's talk in the morning, is that you want to find a good enough solution in a very, very short amount of time. That is the idea behind this. In fact, for the last few years in the MaxSAT Evaluations, what had been happening is that people would just use complete algorithms even in the incomplete track, the same solver, and they had been winning. So, the organizers have changed the benchmarks such that no complete solver will be able to find the solution within the given time limit; the benchmarks are very large, forcing people to come up with fundamentally incomplete strategies and algorithms. So, this is our contribution. It got published in CP 2018, and the extended version with the theoretical analysis went into the journal JSAT in 2019.

Now let us move on. Here is a formula consisting of all four clause combinations over the literals of x1 and x2. It is easy to see that this is not satisfiable; it is UNSAT. Since the SAT solver gives me a binary answer, either satisfiable or not satisfiable, how do I use it to solve an optimization problem? The very clever thing that most MaxSAT solvers do is to introduce a new set of variables r1, r2, r3, r4, often called relaxation variables. You relax each clause with one of these new variables, and now this relaxed formula is satisfiable. But this is not the original formula, so you add an additional constraint: satisfy the formula, but use the least number of the r_i's to do so.
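To make the relaxation idea concrete, here is a minimal brute-force sketch of the semantics on the four-clause example above. This is not how a real MaxSAT solver works, and the clause representation and helper names are my own; it only illustrates that satisfying the relaxed formula with as few r_i set to true as possible is the same as falsifying as few original clauses as possible.

```python
from itertools import product

# The example formula over x1, x2: all four clause combinations, which is UNSAT.
# A literal is (variable index, polarity); a clause is a list of literals.
clauses = [[(0, True), (1, True)],    # (x1 or x2)
           [(0, True), (1, False)],   # (x1 or not x2)
           [(0, False), (1, True)],   # (not x1 or x2)
           [(0, False), (1, False)]]  # (not x1 or not x2)

def falsified(assignment, clause):
    """A clause is falsified when none of its literals is satisfied."""
    return not any(assignment[var] == pol for var, pol in clause)

def min_relaxers(clauses, n_vars):
    """Minimum number of r_i that must be set to true, i.e. the minimum
    number of original clauses any assignment leaves unsatisfied."""
    return min(sum(falsified(a, c) for c in clauses)
               for a in product([False, True], repeat=n_vars))

print(min_relaxers(clauses, 2))   # prints 1: one clause must always be violated
```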
So, when r_i is 1 it means that the original constraint was not satisfied, and when r_i is 0 it means the original constraint was satisfied. That is the idea. This is called a cardinality constraint here, and the idea is that you want to minimize k: use as few r_i's as possible to satisfy the formula.

Now, for a lot of practical applications, some constraints are more important than others, and for that you have weights attached to each of these constraints, to all of these clauses. The idea is that you want to minimize the cost: if you could not satisfy a constraint, then the corresponding weight is added to the cost. The formulation remains the same, but now you have a pseudo-Boolean constraint, where each r_i is still a propositional variable, 0 or 1, but the weights are positive integers. And again it is the same thing: you want to minimize k, because you want to minimize the cost. This matters because it has a lot of applications. In fact, in CP 2015 we took some benchmarks from computational biology and showed that certain kinds of problems and certain encodings work very well, there is a very nice paper by Rupak Majumdar for localization, and there are several other uses. So, this is the motivation: you have a very limited amount of time, just give me a good solution, and if you do not hit the optimum, that is okay. That is where incomplete solvers come into play. Yes, that is a good question; I will discuss later what "good" means, the measure of goodness. It comes later in the talk.

We propose two techniques; this was part of our CP 2018 paper. Let me first explain this encoding, which is from 2015; it is a very easy encoding. Let us say that I want to encode this weighted summation, sigma w_i r_i. The way you do it is that for every literal, here with weights 2, 3, 3, 3, you build some kind of a binary tree. You want to capture the running sum: depending on which literals were true or false, what was the total sum? For example, if L1 is true, this clause is going to force the corresponding A2 to be true. For all these internal nodes, the subscript of A means the weight; A2 being true registers that, yes, I have witnessed a weight of 2, the corresponding literal is true. Please feel free to ask any questions. It is a very simple encoding. Similarly, if L2 is true, then it is going to force A3 to become true, registering that I have witnessed a weight of 3. And then you can say that if both of them are true, then 2 plus 3 is equal to 5, so A5 must be true. You have constraints like this, and you do this for the entire tree. So, this is the encoding of sigma w_i r_i that you remember from before.

Now, the question is the upper bound: how do you ensure that? Pardon? If both of them are forced, it does not matter, right, because I want to do upper bounding. If I wanted to encode a greater-or-equal constraint, things might change, but for an at-most constraint I do not need that direction.
So, if they are 0, that is fine for me; the only thing I want to ensure is that they do not overflow this bound, this upper bound. The threshold I can now enforce because all these sums propagate all the way up to the top level, and all I do is add unit clauses negating A6, A8, and so on. That means that whenever the sum is going to cross 5, it is going to conflict with these unit clauses. So, this encoding ensures that the encoded formula is satisfiable only if this at-most constraint holds.

Okay. Now, the problem with this is that it is exponential in the worst case, because, remember, for every different combination of weights you have to have one of these variables and the corresponding clauses. If I have weights like 1, 2, 4, 8, and so on, then every combination of literals generates a unique sum, which means every combination needs its own variable. So, you can have exponentially many variables in the tree and correspondingly exponentially many clauses. But it becomes polynomial if all the weights are the same, because then, even if you pick different literals, the total weight is still the same; for example, if all of them are 1 or all of them are 2. In that case this encoding is actually quadratic. So, the intuition we want to leverage is this: if all the weights are the same, it is polynomial, and at the other extreme, with weights like 1, 2, 4, 8, it is exponential. What you want is to somehow adjust your weights so that they become the same, because for some of the problems we had, the encoding does not even fit in memory; our solver actually mems out. This is where we want to strike a balance.

For this algorithm we have a parameter m, which is how many clusters of these different weights we want to form. Let us run an example with m equal to 3. These are the different weights of all the clauses. First you sort them by weight, and now I want to put them into three clusters. One way to do this is to initially put everything in one cluster and then keep splitting: since the list is sorted, look at the difference between consecutive numbers, and wherever this difference is largest, put a cluster boundary there. For example, 25 minus 12 is 13, which is the largest consecutive difference between any a_i and a_i+1, so you put a cluster boundary there, and then you keep doing this. Once I have three clusters I stop, because that is the parameter I wanted to achieve. The idea behind dividing clusters at these boundaries is that eventually you want to do weight adjustment: after clustering, I change all the weights to the arithmetic mean of the corresponding cluster. The clustering is there to ensure that when I make this weight adjustment, it is not too drastic; when the difference between weights is too large, you try to put them in different clusters. That is the intuition.
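Before moving on to how the clusters are used, here is a small sketch of why the encoding size depends so strongly on how many distinct weights there are, which is exactly the behaviour the clustering tries to exploit. It only counts the distinct partial sums, one auxiliary variable per distinct sum per internal node of a balanced tree; the tree shape and the function names are my own simplification, not the solver's actual data structures.

```python
def sums_and_count(weights):
    """Return (partial sums reachable at the root, total number of auxiliary
    'sum witness' variables over all internal nodes) for a balanced binary
    tree over `weights`, in the style of the running-sum encoding above."""
    if len(weights) == 1:
        return set(weights), 0                 # a leaf is just the input literal
    mid = len(weights) // 2
    left_sums, left_count = sums_and_count(weights[:mid])
    right_sums, right_count = sums_and_count(weights[mid:])
    # A node can witness either child's sum alone, or any combination of both.
    here = left_sums | right_sums | {a + b for a in left_sums for b in right_sums}
    return here, left_count + right_count + len(here)

for w in ([2, 3, 3, 3],                        # the example from the talk
          [1] * 8,                             # all weights equal: stays small
          [1, 2, 4, 8, 16, 32, 64, 128]):      # every subset sum distinct: blows up
    _, n_vars = sums_and_count(w)
    print(w, "->", n_vars, "sum variables")

# With an upper bound k, one unit clause forbids each root-level sum above k,
# e.g. for k = 5 on the weights 2, 3, 3, 3:
root_sums, _ = sums_and_count([2, 3, 3, 3])
print(sorted(s for s in root_sums if s > 5))   # sums to block: [6, 8, 9, 11]
```

On the example weights 2, 3, 3, 3 the blocked sums come out as 6, 8, 9 and 11, matching unit clauses such as not-A6 and not-A8 mentioned above; for the powers-of-two weights the variable count grows far faster than for equal weights.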
Coming back to the clusters: again, you could pick the median instead, and people can debate what the representative weight of a cluster should be, but the arithmetic mean is what we implemented. Now you can see that, because all the weights within a cluster are equal, the encoding for each cluster is going to be polynomial. So, there is a tradeoff, and it obviously depends on how far you push m.

What I do now is the following. This was the original problem, but I have changed the weights to somewhat different weights, so I am now solving a different problem. I minimize k iteratively: I ask whether a cost of at most k is possible, and if so I reduce k and ask again. Because I only encode the clustered problem, I am hoping the encoding is of manageable size, and you keep decreasing k until you reach UNSAT. Note that even if you reach UNSAT, that does not mean you have reached the optimum, because you are not even solving the original problem; you are solving a different one. Also, even though I am reducing k on the modified problem, that does not mean the cost with respect to the original formula is decreasing monotonically; it does not even mean that. That is why, internally, whenever I find an assignment, I keep track of the corresponding cost in the original formulation and record that assignment; that is easy to do, because I do not have to encode it. It is also easy to see that as I increase m, the number of clusters, my accuracy increases in some sense, because the weight adjustment is smaller.

Yes, the solver gives me an assignment, and with that assignment I compute the cost in the original formulation: since the r_i's are 0 or 1, I can just add up the weights of the clauses whose r_i is 1. I keep track of the smallest cost achieved so far with respect to the original formula, because that is what you really want. Yes, when the modified problem becomes UNSAT, you stop. If you still have time, and in the competition the time is limited, 60 seconds or 300 seconds and so on, then in this algorithm what you can do is increase m to increase the accuracy and continue from the last value of k that you know was satisfiable, because with the modified m it may become satisfiable again. Then you keep going, and if you reach UNSAT and still have time, you increase m again.

What is the strategy when the query comes back satisfiable? If it is SAT, you use that assignment to compute the cost with respect to the original formula, and if that cost is lower than what I have recorded from some earlier assignment, then I keep that assignment, because at every point in time I want to know the lowest-cost assignment found so far. And no, hitting UNSAT on the modified problem does not by itself mean anything, because we have taken the average: if you took the max or the min of each cluster, you would have a one-sided guarantee, something like, if one of the two problems is UNSAT then so is the other.
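Here is a minimal sketch of this outer loop under some assumptions: the clustered weights shown are my own illustrative adjustment of the talk's example weights 2, 3, 3, 3, and the bounded query is answered by brute force rather than by encoding a pseudo-Boolean constraint and calling a SAT solver. The point is only the bookkeeping: the bound k is tightened on the clustered problem, while the best assignment is always recorded with respect to the original weights.

```python
from itertools import product

# The talk's example clauses with weights 2, 3, 3, 3; the clustered weights
# below are my own illustrative adjustment (all replaced by one cluster mean).
clauses = [[(0, True), (1, True)],
           [(0, True), (1, False)],
           [(0, False), (1, True)],
           [(0, False), (1, False)]]
weights           = [2, 3, 3, 3]
clustered_weights = [3, 3, 3, 3]

def cost(assignment, ws):
    """Total weight of clauses falsified by the assignment."""
    return sum(w for cl, w in zip(clauses, ws)
               if not any(assignment[v] == p for v, p in cl))

def solve_at_most(k, ws, n_vars=2):
    """Stand-in for the SAT call on the encoded problem: find any assignment
    whose cost under weights `ws` is at most k, or None if there is none."""
    for a in product([False, True], repeat=n_vars):
        if cost(a, ws) <= k:
            return a
    return None

# Outer search: tighten k on the *clustered* problem, but always record the
# best assignment according to the *original* weights.
best_cost, best_assignment = float("inf"), None
k = sum(clustered_weights)
while (a := solve_at_most(k, clustered_weights)) is not None:
    if cost(a, weights) < best_cost:
        best_cost, best_assignment = cost(a, weights), a
    k = cost(a, clustered_weights) - 1         # ask for something strictly better
print(best_cost, best_assignment)              # best cost w.r.t. original weights
```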
With max or min you would get such a one-sided guarantee, but if you take the arithmetic mean you do not have any guarantee of whether the modified answer is an upper bound or a lower bound with respect to the original problem. You simply record the cost with respect to the original formula for every assignment the solver gives you. When you hit UNSAT, if you have time, you can increase m, start from the last k that was satisfiable, and ask whether it is satisfiable now; if not, you increase m again. It is also easy to see that once m reaches the total number of distinct weights, there is no approximation any more; you are solving the original problem exactly. There could be 1000 clauses but only 2 different weights, and then m equal to 2 already means no approximation. So, if you have time you can keep increasing m up to that limit, but as you increase m, the size of the formula also increases. That is the tradeoff you have: accuracy versus formula size. That is the first strategy we used.

The second one is this. Look at these weights; let us say the clauses at the bottom level each have weight 1. The property here is that, at every level, a single weight at that level is strictly greater than all the weights at the levels below combined, and so on up the levels; for example, the levels below add up to 19 while this single weight is 20. Let us call this the Boolean multilevel optimization property, the BMO property. Assume for a moment that your weighted formula has this property. Then I can do the following. I know that violating a single clause at a given level is worse than violating all the clauses at the levels below combined, so it is much better to keep that single r_i at 0 than to save all the lower-level ones. So, I start with the highest level and impose a cardinality constraint there; this is always polynomial, because within a level all the weights are the same. I ask whether it is possible to set only two of them to true. If the answer is yes, then I reduce the bound, only for that level; when the answer is UNSAT, I freeze the constraint at that level, because you cannot do any better there. Then you move on to the next level below, and since there are, say, four clauses there, you ask whether you can set only three of them to true and still satisfy the formula. If yes, you keep going, and whenever you hit UNSAT you freeze the last bound that was satisfiable and move on.

This is a greedy strategy: you minimize the cost from the top level downwards, and it produces the optimum if the BMO property holds, which is kind of obvious, because a single weight at a level is greater than all the levels below combined, and so on at every level. If the property does not hold, there is no such guarantee; you can still do it, but you do not know how much you are going to deviate. Our idea was: suppose your formula does not have that property, let us implement this algorithm anyway and see what happens and how much it deviates. So, yes, there is no guarantee that it converges to the optimum. And here too, after you are done with all the levels, if you still have time left on the clock, you can switch to a complete search algorithm.
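Here is a small sketch of this level-by-level greedy search. The clauses, weights and grouping into levels are hypothetical, and the per-level query is again answered by brute force instead of a cardinality encoding plus a SAT call; the point is the freezing of each level's bound before descending to the next one.

```python
from itertools import product

# Soft clauses grouped by weight level, highest weight first (hypothetical data).
# Within a level all weights are equal, and each weight exceeds the sum of all
# lower levels, so the BMO property holds here.
levels = [(20, [[(0, True)]]),                                   # weight 20
          (4,  [[(0, False), (1, True)], [(1, False)]]),          # weight 4
          (1,  [[(0, False)], [(1, True)], [(0, True), (1, False)]])]  # weight 1
N_VARS = 2

def violated(assignment, cls):
    """Number of clauses in `cls` that the assignment falsifies."""
    return sum(1 for c in cls if not any(assignment[v] == p for v, p in c))

def exists(frozen, level_clauses, k):
    """Stand-in for the SAT call: is there an assignment that respects all the
    frozen per-level bounds and falsifies at most k clauses at this level?"""
    return any(all(violated(a, cls) <= bound for cls, bound in frozen)
               and violated(a, level_clauses) <= k
               for a in product([False, True], repeat=N_VARS))

# Greedy, level by level: tighten the bound at the current level as far as
# possible, then freeze it before moving to the next (lower-weight) level.
frozen = []
for weight, cls in levels:
    k = len(cls)                      # trivially achievable starting bound
    while k > 0 and exists(frozen, cls, k - 1):
        k -= 1
    frozen.append((cls, k))
    print(f"weight {weight}: at most {k} clause(s) violated")
```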
So, this second strategy is an incomplete algorithm by default, because even when it terminates there is no guarantee that the result is the optimum; you can then switch to a complete search algorithm or a local search algorithm.

Now, the measure of goodness, the way we do it, and this is the standard in the competition, is as follows. For every benchmark you find the best solution known for that benchmark, either produced by one of the solvers or from prior knowledge. Remember we want to minimize the cost, so the cost your solver finds is always going to be greater than or equal to that best known cost. That means the score for an individual benchmark, the best known cost divided by your cost, is somewhere between 0 and 1. You sum these up and divide by the total number of benchmarks; that is the score, and the goal is to improve it, to be as close to 1 as possible. If the actual optimum is already known, then that is the best known cost. A solver gets a score of 0 on a benchmark if it fails for some reason. Yes, this is how the scoring is done, and we have run experiments for three particular timeouts.

For the first strategy, when you do not apply any clustering, this shows, I think summed over the entire benchmark set, how many new clauses have been added compared to the original; it goes up to about 3000 times the original. As m increases, the formula size increases, which is what this shows, and this is the score. You can see that as m increases the accuracy roughly increases, but you see a small decrease here, because when m increases the formula size also increases and you may time out before finishing. That was the first strategy. The second strategy only uses cardinality constraints, so size is not a problem; in fact, for almost all formulas we ran with m equal to the number of distinct weights, so all the way. But remember that with the second strategy, even if you do no clustering at all, you are still not guaranteed to reach the optimum. Here you can see that the accuracy increases pretty much monotonically for the second strategy.

And yes, we have compared with other solvers; this is an old slide, and we have a newer set of experiments. QMaxSAT was a complete solver that won in MaxSAT Evaluation 2017, and the WPM solvers, I think, were ranked first and second. maxroster was the winner of MaxSAT Evaluation 2017, and these two are our work. You can see that this one is the best across all three timeouts, whereas this one, with clustering, does well when the time limit is very small; as you give more time, maxroster, for example, starts doing better than it. There are a lot more plots in the journal version, but these are old slides. As for the competitions: with the first strategy we placed fourth in both the 60-second and 300-second timeouts in MaxSAT Evaluation 2018; with the second strategy we were second for 300 seconds and first for 60 seconds. In MaxSAT Evaluation 2019 the first strategy placed sixth and fourth, and the second strategy placed third in both timeouts. Now, even in MaxSAT Evaluation 2019, the first-place solver is TT-Open-WBO-Inc.
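For reference, here is a tiny sketch of the scoring rule described above; the benchmark names and costs are hypothetical, and the corner case of a best known cost of 0 is ignored.

```python
def evaluation_score(results):
    """Average incomplete-track score over benchmarks, as described above.
    `results` maps a benchmark name to (best_known_cost, cost_found_or_None);
    a benchmark the solver failed on (None) contributes 0."""
    per_benchmark = [best / found if found is not None else 0.0
                     for best, found in results.values()]
    return sum(per_benchmark) / len(per_benchmark)

# Hypothetical numbers: the solver matches the best known cost on b1,
# is slightly worse on b2, and fails on b3.
print(evaluation_score({"b1": (10, 10), "b2": (10, 12), "b3": (7, None)}))
# (1.0 + 0.833... + 0.0) / 3 is roughly 0.61
```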
And TT-Open-WBO-Inc actually uses Open-WBO-Inc, our solver. They use the same algorithm and everything; what they changed is the variable selection heuristic, the activity bumping, and so on. Those variable heuristics are the things they changed. We use the SAT solver as a black box; they do not use it as a black box, but it is the same algorithm, and everything else is the same.

One thing to note: in MaxSAT Evaluation 2019, since with the second strategy we mostly do not time out, we switch to a local search algorithm afterwards. That means we already have an assignment, and from that assignment you pick a variable and keep flipping it, with some heuristics, and check whether that decreases your cost. That combination is working very well, and it is the same thing that TT-Open-WBO-Inc, the winner of this year's MaxSAT Evaluation 2019, uses. So, these are the tools: Open-WBO is a complete solver, Open-WBO-Inc is an incomplete solver, and both of them are available in the open.

Another part is the guarantees, which I have not talked about so far. In the journal version we did a theoretical analysis of this, and it turns out that you can actually deviate exponentially from the optimal solution. In principle you can construct pathological formulas: with n variables, even if I pick weights like 1, 2, 3, 4, so it is not as if I am directly picking very high weights, it is still possible to construct formulas that deviate exponentially from the optimum, because you can form exponentially many Boolean combinations of the variables. We have shown that. So, there are no theoretical guarantees, but at least in practice this works very well.

Generally, in a MaxSAT solver, if you have hard clauses, their weight is infinite; do we see any difference in the evaluation if there are lots of hard clauses compared to the number of soft clauses? Well, here we are using a SAT solver, so if a clause is hard you simply do not relax it; you do not introduce a relaxation variable for it. Any satisfying assignment returned by the SAT solver is then forced to satisfy the hard part of the original formula. The relaxation is done only for soft clauses, and everything else follows. But I agree that, for example, in a local-search kind of scheme it matters: there, hard clauses are usually initialized with very high weights, and that is how local search strategies handle them.

Yes, some of the variables involved are the introduced relaxation variables, the r's. Yes, exactly. So, for example, this year's winner, TT-Open-WBO-Inc, used exactly our framework, our algorithm, and everything. What they did is this: these relaxation variables, which they call target variables, are treated specially whenever the solver has to make a choice on the polarity, that is, when it makes a decision on one of these variables and must pick 0 or 1. If you do not interfere, the SAT solver usually takes the last best known polarity. What they do instead is say that the first value to try for such a decision variable is always 0.
So, they are being optimistic; that is one thing they did. The second thing they did concerns these variables of importance: in VSIDS-style variable heuristics there is a notion of the activity of a variable, and whenever a variable is involved in a conflict, its activity is bumped. They initialize these target variables with a slightly higher activity, which means the variable selection gives some bias to those variables. That is what they did in 2019; that is not our work, that is by Alexander Nadel from Intel. So, yes, when you do not use the SAT solver as a black box and you have some domain knowledge, it is good to actually go inside the solver and play around with these things. Thank you for the talk. Thank you.
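As a small appendix, here is a minimal sketch of the flip-based local search mentioned earlier: start from an assignment and greedily flip any single variable whose flip lowers the weighted cost of the violated soft clauses. The instance is hypothetical, hard clauses are not modelled, and real implementations add heuristics such as randomization and clause weighting on top of this.

```python
# Hypothetical weighted soft clauses: (clause, weight), literal = (var, polarity).
soft = [([(0, True), (1, True)], 2),
        ([(0, True), (1, False)], 3),
        ([(0, False), (1, True)], 3),
        ([(0, False), (1, False)], 3)]

def cost(assign):
    """Weighted cost of the soft clauses falsified by the assignment."""
    return sum(w for cl, w in soft
               if not any(assign[v] == p for v, p in cl))

def local_search(assign):
    """Greedy hill climbing: keep a flip only if it strictly lowers the cost."""
    assign = list(assign)
    best = cost(assign)
    improved = True
    while improved:
        improved = False
        for v in range(len(assign)):
            assign[v] = not assign[v]          # tentatively flip variable v
            if cost(assign) < best:
                best = cost(assign)            # keep the improving flip
                improved = True
            else:
                assign[v] = not assign[v]      # undo a non-improving flip
    return assign, best

print(local_search([True, False]))   # e.g. starting from x1 = 1, x2 = 0
```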