# SAT Problem Solved
*by Javantea*
*Aug 14, 2015*
I solved my first SAT problem and in doing so, I had to write a valuable improvement to [satispy](https://github.com/netom/satispy), a simple and powerful Python frontend to [minisat](http://minisat.se/) and [lingeling](http://fmv.jku.at/lingeling/). I'm making a pull request to netom, the author of satispy and I am [publishing my code on Github](https://github.com/Javantea/satispy) since the original was published on Github. Allow me to show you the problem I was working on and the possibly elegant solution.
```python
def f1(x):
v = x ^ 32
w = v & 99
return w ^ x
#end def f1(x)
```
The function f1(x) is a trivially simple function that behaves so simply that a person could easily guess the approximate output. However, it's complicated enough that if I asked you whether 0x3333 could be produced as an output, you'd have to test a few thousand runs, right? This is a problem that brute force shouldn't have to solve. Either it's satisfiable or it's not. If it's satisfiable, we should be able to present a solution in a reasonable amount of time. Note that 0x3333 is 16 bits long, which means that brute force could be very expensive. Using algebra to limit the range that you have to brute force works on this problem, but what if I made the problem 64-bit (instead of Python's multi precision integer implementation) and made the constants 64-bit variables. Would you be able to solve this in a reasonable amount of time using algebra and brute force? I don't know, but let's find out if a [SAT solver](https://en.wikipedia.org/wiki/Boolean_satisfiability_problem) can solve this first.
Writing [DIMACS CNF]() (or even CNF) by hand is a terrible experience. Even when you're working on a handful of variables it gets messy. A 16-bit problem like f1 ends up with 64 variables as 272 clauses which are not easy to compute in the slightest. Enter satispy. Use similar algebraic operations as the original code and now we have a pretty amazing SAT system. First of all, remember that SAT is about boolean, so when you have a 16-bit variable, you have to deal with 16 variables. But with python, that's just a for loop. I could have created a class, but I'm not doing that just yet. I want a solution, not a framework to solve anything in n-bits right now.
```python
bits = 16
vars_x = [satispy.Variable('x%i' % i) for i in range(bits)]
#For this to work... we need to set v and w's constants.
k = 32
vars_k = [satispy.Variable('k%i' % i) for i in range(bits)]
exp_k = satispy.Cnf()
for i in range(bits):
if k & (1<* 1: expected_output = int(sys.argv[1])
vars_cmp = [satispy.Variable('cmp%i' % i) for i in range(bits)]
exp_cmp = satispy.Cnf()
for i in range(bits):
if expected_output & (1<** 0 and len(other.dis) > 0:
new_dis = []
for d1, d2 in [(d1,d2) for d1 in self.dis for d2 in other.dis]:
d3 = d1 | d2
new_dis.append(d3)
elif len(self.dis) == 0:
new_dis = other.dis
else:
new_dis = self.dis
c = Cnf()
c.dis = new_dis
return c
def __xor__(self, other):
return (self | other) & (-self | -other)
def __neg__(self):
cnfs = []
for d in self.dis:
c = Cnf()
for v in d:
c.dis.append(frozenset([-v]))
cnfs.append(c)
ret = cnfs.pop()
for cnf in cnfs:
ret |= cnf
return ret
def __rshift__(self, other): # implies
return -self | other
def __str__(self):
ret = []
for d in self.dis:
ret.append(" | ".join(map(str,d)))
return "(" + ") & (".join(ret) + ")"
def __eq__(self, other):
return self.dis == other.dis
def __hash__(self):
return hash(self.dis)
```
The `__neg__` function loops over the clauses and then over each variable in the clause creating a new Cnf object with the inverted variable as a clause. It puts those Cnf objects into a list. Then it ors them together. Surprisingly, the for double loop was not the issue. It easily gets past that. The problem comes in the or function. Let's look at `__or__` now.
```python
for d1, d2 in [(d1,d2) for d1 in self.dis for d2 in other.dis]:
d3 = d1 | d2
```
That's the problem. How do we go about solving such a difficult problem though? It seems almost unsolvable from the perspective of how do I reduce these functions. The answer is to look at the logic. Take a look at this.
```python
import satispy
try:
satispy.cnf.cnfClass = satispy.NaiveCnf
except:
pass
#end try
a = satispy.Variable('a')
b = satispy.Variable('b')
c = satispy.Variable('c')
d = (((a ^ b) & c) ^ a)
print(str(d))
```
Output:
```python
(a | b) & (-a | a | -b) & (a | c) & (-a | a | -c) & (-a | b | -c) & (-a | a | -c | -b) & (-a | b | -c | -b)
```
As you can see ((a ^ b) & c) ^ a) produces a pretty small output. But look at the clauses for a moment.
```python
(a | b) & # Fine
(-a | a | -b) & # Not fine
(a | c) & # Fine
(-a | a | -c) & # Not fine
(-a | b | -c) & # Fine
(-a | a | -c | -b) & # Not fine
(-a | b | -c | -b) # Not fine
```
Look closely at `(-a | a | -b)`. Do you see any logical flaw here? `-a | a` is a worthless statement. You can just replace it with True. Thus we get `(True | -b)`. But that is a worthless statement as well because `True | x` is _always_ `True`. Allow me to illustrate for the non-programmers. Either I am a person or I am not a person or you are an automobile. It doesn't matter whether you are an automobile or not, because the statement has already said "I am a person" or "I am not a person" which is certain to be true. Therefore the entire statement can be removed. The way that CNF works is that you have clauses anded together. Each clause is a set of variables or negated variables or'ed together. This very strict regular format means that all logic has to be converted into this format. It also means that when you naively convert (like satispy does) you end up with massive illogical statements. I wrote a very small function called reduceCnf that filters out any clause that has (v | -v) in it. For example, running `reduceCnf(d)` on our d above gives us the following:
```python
str(satispy.reduceCnf(d))
'(a | b) & (a | c) & (-a | b | -c)'
```
As you can see a 7 clause CNF now becomes a 3 clause CNF. Using a modified negate and or that uses the reduceCnf function liberally, we are able to run the [easy_problem1.py](/uploads/easy_problem1.py) very quickly. It actually makes sense that instead of writing negate and or naively, we should instead write them in such a way to never allow redundancy because of the memory problem. Anyone using satispy on a serious SAT problem has probably run into this (I can't imagine someone not running into this).
SAT is not nearly as difficult as it appears. I recommend spending a full day working with the tools and getting a virtual environment set up to safely run any SAT solver that you may need.
Future work will of course involve writing an integer format and a constant format for satispy which will allow a user to write the following code:
```python
bits = 64
var_x = Int64Variable('x')
k = Constant64(32)
m = Constant64(99)
expected_output = Constant64(0x3334)
exp_eq = -((((var_x ^ k) & m) ^ var_x) ^ expected_output)
solution_cnf = k.exp & m.exp & exp_eq & expected_output.exp
print(len(solution_cnf.dis))
io = satispy.io.DimacsCnf()
open('easy_problem1a.cnf', 'w').write(io.tostring(solution_cnf))
solver = satispy.solver.Minisat()
res = solver.solve(solution_cnf)
if res.success:
print('\n'.join(['%s %r' % (v.name, res[v]) for v in vars_x]))
x = int(''.join([str(int(res[v])) for v in vars_x][::-1]), 2)
print('x =', x)
elif res.error:
print(res.error)
else:
print("Not satisfiable.")
#end if
```
I am very interested to hear what people have to say about my research. Feel free to e-mail me ([please use PGP/GPG](https://www.altsci.com/gpg.html)), [tweet](https://twitter.com/Javantea), or phone me. I am currently looking for an job or internship at a Nanotechnology research lab. Let me know if you can introduce me to someone in the field.
Off topic: I wrote this in Markdown and used [pandoc](http://pandoc.org/) to convert it to html. It looks nice, doesn't it? It also supports a lot more formats than HTML5. The [markdown version is here](/uploads/easy_problem1.md).
Javantea out.
*