SAT Problem Solved

by Javantea
Aug 14, 2015

sat1-0.1.tar.xz [sig]

I solved my first SAT problem and in doing so, I had to write a valuable improvement to satispy, a simple and powerful Python frontend to minisat and lingeling. I'm making a pull request to netom, the author of satispy and I am publishing my code on Github since the original was published on Github. Allow me to show you the problem I was working on and the possibly elegant solution.

    def f1(x):
        v = x ^ 32
        w = v & 99
        return w ^ x
    #end def f1(x)

The function f1(x) is a trivially simple function that behaves so simply that a person could easily guess the approximate output. However, it's complicated enough that if I asked you whether 0x3333 could be produced as an output, you'd have to test a few thousand runs, right? This is a problem that brute force shouldn't have to solve. Either it's satisfiable or it's not, and if it's satisfiable, we should be able to present a solution in a reasonable amount of time. Note that 0x3333 is 16 bits long, which means that brute force could be expensive. Using algebra to limit the range you have to brute force works on this problem, but what if I made the problem 64-bit (instead of Python's multi-precision integer implementation) and made the constants 64-bit variables? Would you be able to solve it in a reasonable amount of time using algebra and brute force? I don't know, but let's first find out whether a SAT solver can solve it.
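
The claim above is easy to sanity-check by brute force, which is cheap at 16 bits but hopeless at 64. A quick sketch (my code, not part of the solution):

```python
# Brute-force search over all 16-bit inputs for a given output of f1.
def f1(x):
    v = x ^ 32
    w = v & 99
    return w ^ x
#end def f1(x)

hits = [x for x in range(1 << 16) if f1(x) == 0x3333]
print(len(hits))  # prints 0: no 16-bit x produces 0x3333
```

Interestingly, the brute force reports that 0x3333 is not satisfiable at all, which is exactly the kind of answer we want the SAT solver to give us without the search.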

Writing DIMACS CNF (or even plain CNF) by hand is a terrible experience. Even when you're working with a handful of variables it gets messy. A 16-bit problem like f1 ends up with 64 variables and 272 clauses, which are not easy to compute by hand in the slightest. Enter satispy. Using the same algebraic operations as the original code, we get a pretty amazing SAT system. First of all, remember that SAT is about booleans, so when you have a 16-bit variable, you have to deal with 16 boolean variables. But with Python, that's just a for loop. I could have created a class, but I'm not doing that just yet. I want a solution right now, not a framework that solves anything in n bits.

    import satispy
    import sys

    bits = 16

    vars_x = [satispy.Variable('x%i' % i) for i in range(bits)]

    #For this to work... we need to set v and w's constants.

    k = 32

    vars_k = [satispy.Variable('k%i' % i) for i in range(bits)]

    exp_k = satispy.Cnf()
    for i in range(bits):
        if k & (1<<i):
            exp_k &= vars_k[i]
        else:
            exp_k &= -vars_k[i]
        #end if
    #next i

    m = 99
    vars_m = [satispy.Variable('m%i' % i) for i in range(bits)]

    exp_m = satispy.Cnf()
    for i in range(bits):
        if m & (1<<i):
            exp_m &= vars_m[i]
        else:
            exp_m &= -vars_m[i]
        #end if
    #next i

    expected_output = 0x3334
    if len(sys.argv) > 1: expected_output = int(sys.argv[1])

    vars_cmp = [satispy.Variable('cmp%i' % i) for i in range(bits)]

    exp_cmp = satispy.Cnf()
    for i in range(bits):
        if expected_output & (1<<i):
            exp_cmp &= vars_cmp[i]
        else:
            exp_cmp &= -vars_cmp[i]
        #end if
    #next i

    exp_eq = satispy.Cnf()
    for i in range(bits):
        #v_exp = vars_x[i] ^ vars_k[i]
        #w_exp = v_exp & vars_m[i]
        #out_exp = w_exp ^ vars_x[i]
        # This line eats all memory. When I don't negate, it doesn't.
        #exp_eq &= -(out_exp ^ vars_cmp[i])
        exp_eq &= -((((vars_x[i] ^ vars_k[i]) & vars_m[i]) ^ vars_x[i]) ^ vars_cmp[i])
    #next i

    solution_cnf = exp_k & exp_m & exp_eq & exp_cmp

    io = satispy.io.DimacsCnf()
    open('easy_problem1a.cnf', 'w').write(io.tostring(solution_cnf))
    solver = satispy.solver.Minisat()
    res = solver.solve(solution_cnf)
    if res.success:
        print('\n'.join(['%s %r' % (v.name, res[v]) for v in vars_x]))
        x = int(''.join([str(int(res[v])) for v in vars_x][::-1]), 2)
        print('x =', x)
    elif res.error:
        print("Error:", res.error)
    else:
        print("Not satisfiable.")
    #end if

The first thing to note is that the constants k and m are variables and they have trivial expressions that are appended to the solution CNF. What we're doing is we're setting those constants such that everything is a variable. In the future we could use the fact that we know the value of these to solve equations in the CNF before it gets to the solver, but that's probably not going to be much faster than the solver (which is written in C and has been optimized).
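
A standalone sketch of that idea, using my own helper name and plain integer literals rather than satispy objects: pinning an n-bit constant is just one unit clause per bit.

```python
# DIMACS-style integer literals: -n means "variable n is false".
# Pinning a constant emits one unit clause per bit.
def constant_clauses(value, bits, first_var):
    return [frozenset([first_var + i]) if (value >> i) & 1
            else frozenset([-(first_var + i)])
            for i in range(bits)]

# k = 32 over 8 bits: only bit 5 (variable 6) is forced true.
print(constant_clauses(32, 8, 1))
```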

The logic is done on the line below.

    exp_eq &= -((((vars_x[i] ^ vars_k[i]) & vars_m[i]) ^ vars_x[i]) ^ vars_cmp[i])

As you can see I've taken the equation

    0x3333 == ((x ^ 32) & 99) ^ x

and I've made it into a true/false and when it's true, we get x (the input).

    not ((((x ^ 32) & 99) ^ x) ^ 0x3333)
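
Bit by bit, that encoding works because a negated XOR is exactly equality, which a tiny truth table confirms:

```python
# not (p ^ q) holds exactly when p == q, so negating the XOR of each
# output bit with its expected bit forces the two bits to be equal.
for p in (False, True):
    for q in (False, True):
        assert (not (p ^ q)) == (p == q)
print("negated XOR is equality")
```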

And done? No. At this point, I had an almost working piece of code. If you run the code with satispy from netom, you will quickly see why I made a few comments to the effect that SAT is difficult and takes hours upon hours to debug. The CNF blows up when you negate ((((a ^ b) & c) ^ a) ^ e). Blows up is a technical term for when a piece of code tries to eat gigabytes, terabytes, or exabytes of memory. So now we look at what's going on in the negate function in satispy.Cnf.

    class Cnf(object):
        def __init__(self):
            self.dis = []
        @classmethod
        def create_from(cls, x):
            if isinstance(x, Variable):
                cnf = Cnf()
                cnf.dis = [frozenset([x])]
                return cnf
            elif isinstance(x, cls):
                return x
            else:
                raise Exception("Could not create a Cnf object from %s" % str(type(x)))
        def __and__(self, other):
            other = Cnf.create_from(other)
            result = Cnf()
            result.dis = self.dis + other.dis
            return result
        def __or__(self, other):
            other = Cnf.create_from(other)
            if len(self.dis) > 0 and len(other.dis) > 0:
                new_dis = []
                for d1, d2 in [(d1,d2) for d1 in self.dis for d2 in other.dis]:
                    d3 = d1 | d2
                    new_dis.append(d3)
            elif len(self.dis) == 0:
                new_dis = other.dis
            else:
                new_dis = self.dis
            c = Cnf()
            c.dis = new_dis
            return c
        def __xor__(self, other):
            return (self | other) & (-self | -other)
        def __neg__(self):
            cnfs = []
            for d in self.dis:
                c = Cnf()
                for v in d:
                    c.dis.append(frozenset([-v]))
                cnfs.append(c)
            ret = cnfs.pop()
            for cnf in cnfs:
                ret |= cnf
            return ret
        def __rshift__(self, other): # implies
            return -self | other
        def __str__(self):
            ret = []
            for d in self.dis:
                ret.append(" | ".join(map(str,d)))
            return "(" + ") & (".join(ret) + ")"
        def __eq__(self, other):
            return self.dis == other.dis
        def __hash__(self):
            return hash(self.dis)

The __neg__ function loops over the clauses and then over each variable in each clause, creating a new Cnf object with the inverted variable as a clause. It puts those Cnf objects into a list, then ors them together. Surprisingly, the double for loop was not the issue. It easily gets past that. The problem comes in the or function. Let's look at __or__ now.

    for d1, d2 in [(d1,d2) for d1 in self.dis for d2 in other.dis]:
        d3 = d1 | d2

That's the problem. How do we go about solving such a difficult problem, though? From the angle of "how do I reduce these expressions" it seems almost unsolvable. The answer is to look at the logic. Take a look at this.

    import satispy
    try:
        # use the naive (unreduced) Cnf if the installed satispy provides it
        satispy.cnf.cnfClass = satispy.NaiveCnf
    except AttributeError:
        pass
    #end try
    a = satispy.Variable('a')
    b = satispy.Variable('b')
    c = satispy.Variable('c')
    d = (((a ^ b) & c) ^ a)
    print(d)

Printing d gives:

    (a | b) & (-a | a | -b) & (a | c) & (-a | a | -c) & (-a | b | -c) & (-a | a | -c | -b) & (-a | b | -c | -b)

As you can see (((a ^ b) & c) ^ a) produces a pretty small output. But look at the clauses for a moment.

    (a | b) &  # Fine
    (-a | a | -b) & # Not fine
    (a | c) & # Fine
    (-a | a | -c) & # Not fine
    (-a | b | -c) & # Fine
    (-a | a | -c | -b) &  # Not fine
    (-a | b | -c | -b)    # Not fine

Look closely at (-a | a | -b). Do you see the logical flaw? -a | a is a worthless statement: you can just replace it with True. That gives us (True | -b), which is also worthless, because True | x is always True. Allow me to illustrate for the non-programmers. Either I am a person, or I am not a person, or you are an automobile. It doesn't matter whether you are an automobile or not, because the statement has already said "I am a person or I am not a person", which is certain to be true. Therefore the entire clause can be removed.

The way CNF works is that you have clauses anded together, and each clause is a set of variables or negated variables or'ed together. This very strict, regular format means that all logic has to be converted into it. It also means that when you convert naively (like satispy does) you end up with a massive collection of unnecessary clauses. I wrote a very small function called reduceCnf that filters out any clause that has (v | -v) in it. For example, running reduceCnf(d) on our d above gives us the following:

    '(a | b) & (a | c) & (-a | b | -c)'

As you can see, a 7-clause CNF becomes a 3-clause CNF. Using a modified negate and or that use the reduceCnf function liberally, we are able to run easy_problem1a.py very quickly. It actually makes sense that instead of writing negate and or naively, we should write them in such a way that redundancy is never produced in the first place, because of the memory problem. Anyone using satispy on a serious SAT problem has probably run into this (I can't imagine someone not running into it).
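
reduceCnf itself is tiny. A standalone version of the same filter, using integer literals instead of satispy objects (so a sketch of the idea rather than the actual patch):

```python
# Clauses are frozensets of DIMACS-style integer literals; a clause
# containing both v and -v is a tautology and can be dropped outright.
def reduce_cnf(clauses):
    return [d for d in clauses if not any(-lit in d for lit in d)]

# The 7-clause CNF from the example above, with a=1, b=2, c=3:
cnf = [frozenset(s) for s in
       [{1, 2}, {-1, 1, -2}, {1, 3}, {-1, 1, -3},
        {-1, 2, -3}, {-1, 1, -3, -2}, {-1, 2, -3, -2}]]
print(reduce_cnf(cnf))  # only (a | b), (a | c), (-a | b | -c) survive
```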

One might suggest that what I've done is inefficient and should only be considered a workaround. You can see that on this problem it wastes a tiny bit of time, and that could add up. But if you look at what I've done, I've only run this function where it absolutely has to run. There are improvements that could be made, but until someone writes them, this is the correct solution. I recommend that anyone trying to get more performance out of satispy look at improving the or and negate functions further than I have.
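
To make the memory problem concrete, here is a count-only model of the naive negate (my sketch, integer literals again): or-ing distributes clause sets pairwise, so negating n clauses of three literals multiplies out to 3^n clauses before any reduction.

```python
# Naive CNF negation via De Morgan: each clause becomes a set of unit
# clauses, and those sets are OR'ed together by pairwise union. The
# clause count multiplies at every OR.
def cnf_or(c1, c2):
    if c1 and c2:
        return [d1 | d2 for d1 in c1 for d2 in c2]
    return c1 or c2

def cnf_neg(clauses):
    units = [[frozenset([-lit]) for lit in d] for d in clauses]
    ret = units.pop()
    for c in units:
        ret = cnf_or(ret, c)
    return ret

# Negating just 8 ternary clauses already yields 3**8 clauses:
cnf = [frozenset({3*i + 1, 3*i + 2, 3*i + 3}) for i in range(8)]
print(len(cnf_neg(cnf)))  # prints 6561, i.e. 3**8
```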

SAT is not nearly as difficult as it appears. I recommend spending a full day working with the tools and getting a virtual environment set up to safely run any SAT solver that you may need. When you have the environment, write a simple logical function in Python and see if you can solve it with the code.
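
If you want a feel for what the solver is doing on such toys, a brute-force CNF checker fits in a few lines (my sketch; real solvers are vastly smarter than this):

```python
from itertools import product

# A clause is satisfied if any literal agrees with the assignment.
# assignment maps variable number -> bool; literal -n means "n is false".
def sat_brute(clauses, nvars):
    for bits in product([False, True], repeat=nvars):
        assign = dict(enumerate(bits, start=1))
        if all(any(assign[abs(l)] == (l > 0) for l in d) for d in clauses):
            return assign
    return None

# (a | b) & (-a | -b) & (a): a XOR b with a forced true
print(sat_brute([{1, 2}, {-1, -2}, {1}], 2))  # a=True, b=False
```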

Future work will of course involve writing an integer format and a constant format for satispy which will allow a user to write the following code:

    bits = 64

    var_x = Int64Variable('x')
    k = Constant64(32)
    m = Constant64(99)
    expected_output = Constant64(0x3334)

    exp_eq = -((((var_x ^ k) & m) ^ var_x) ^ expected_output)

    solution_cnf = k.exp & m.exp & exp_eq & expected_output.exp

    io = satispy.io.DimacsCnf()
    open('easy_problem1a.cnf', 'w').write(io.tostring(solution_cnf))
    solver = satispy.solver.Minisat()
    res = solver.solve(solution_cnf)
    if res.success:
        print('x =', res[var_x])
    elif res.error:
        print("Error:", res.error)
    else:
        print("Not satisfiable.")
    #end if

I am very interested to hear what people have to say about my research. Feel free to e-mail me (please use PGP/GPG), tweet, or phone me. I am currently looking for a job or internship at a nanotechnology research lab. Let me know if you can introduce me to someone in the field.

Off topic: I wrote this in Markdown and used pandoc to convert it to HTML. It looks nice, doesn't it? Pandoc also supports a lot of output formats besides HTML5. The markdown version is here.

Javantea out.
