UP B-SAT WITHOUT A PADDLE =============================================================================================
████ ████░░░░████ ██░░░░░░░░░░▓▓ ██░░░░░░▓▓██████ ██▒▒░░▓▓▒▒▒▒▒▒▒▒████░░░░░░░░░░ ██░░▒▒██▒▒▒▒▒▒▒▒▒▒▒▒██░░░░░░░░░░░░░░░░░░ ██▒▒░░██▒▒▒▒▒▒▒▒▒▒▒▒▒▒██░░░░░░░░░░░░░░░░░░ ██▒▒ ██▒▒▒▒▒▒▒▒▒▒▒▒▒▒██░░░░░░░░░░░░░░░░░░ ██▒▒░░██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓▓██░░░░░░░░░░░░░░░░░░ ░░██▒▒░░██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██░░░░░░░░░░░░░░░░ ░░░░██▒▒▒▒░░██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██░░░░░░░░░░░░ ░░&&░░██▒▒▒▒░░▓▓▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▓▓░░░░░░░░░░ ░░&&░░░░██▒▒▒▒░░██▒▒▒▒▒▒▒▒▒▒████▒▒████░░░░░░░░ ░░&&░░░░░░██▒▒▒▒░░██▒▒▒▒▒▒██▓▓▓▓████▒▒██░░░░░░ ░░&&░░░░░░░░██▒▒▒▒░░██▒▒██▓▓▓▓▓▓▓▓▓▓██▒▒██░░░░░░░░░░░░░░░░ ░░░░&&░░░░░░░░██▒▒░░██▒▒██▓▓▓▓▓▓▓▓▓▓▓▓████████▒▒░░░░░░░░░░░░ ░░░░░░&&&&░░░░░░██▒▒░░██▒▒██▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓██▒▒░░░░░░░░░░░░ ░░░░░░░░░░&&░░░░░░██▒▒░░████▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓██▒▒░░░░░░░░░░░░ ░░░░░░░░░░░░&&░░░░░░██▒▒░░██▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓██░░░░░░░░░░░░ ░░░░░░░░░░░░░░&&░░░░░░██▒▒░░▓▓▓▓██████████████▒▒██░░░░░░░░░░░░░░░░ ░░░░░░░░░░░░░░&&░░░░░░░░██▒▒░░██▒▒▒▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓██░░░░░░░░░░░░░░░░ ░░░░░░░░░░░░░░&&░░░░░░░░░░██▒▒░░████░░██████████▓▓▓▓████░░░░░░░░░░░░░░░░░░░░░░ ░░░░░░░░░░░░&&░░░░░░░░░░░░██▒▒░░██ ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓██▓▓██░░░░░░&&░░░░░░░░░░░░░░░░ ░░░░░░░░░░░░&&░░░░░░░░░░░░░░██▒▒░░██▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓██▓▓██░░░░░░((&|!)░░░░░░░░░░░░ ░░░░░░░░░░░░&&░░░░░░░░!!░░░░░░██▒▒░░▓▓░░██▓▓▓▓▓▓▓▓▓▓▓▓██▓▓▓▓▓▓▓▓▓▓░░░░&&░░||||||░░░░░░ ░░░░░░░░░░&&░░░░░░░░!!░░░░░░░░██▒▒░░████▒▒██████████▓▓▓▓▓▓▓▓▓▓▓▓██░░░░░░░░░░░░!!░░░░░░ ░░░░░░░░░░░░&&░░░░░░░░!!░░░░░░░░██▒▒░░▒▒██▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓██▒▒██░░░░░░░░░░░░░░░░░░░░ ░░░░░░░░░░&|&░░░░░░░░░!!░░░░░░░░▓▓▒▒▒▒████▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓██▒▒▒▒▒▒▓▓░░░░░░░░░░░░░░░░░░ ░░░░░░░░░░░░&&░░░░░░░░░░!!░░░░░░░░██▒▒██░░██████▓▓▓▓▓▓▓▓██▒▒▒▒▒▒▒▒▒▒████░░░░░░░░░░░░░░ ░░░░░░░░░░░░░░(&░░░░░░░░!!░░░░░░░░██▒▒██▒▒░░██▒▒████████░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒██░░░░░░░░░░░░░░ ░░░░!!░░░░░░░░&)░░░░░░░░!!░░░░░░░░░░████▒▒▒▒░░██▒▒▒▒▒▒░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██░░░░░░░░░░░░ ░░░░░░!!░░░░░░░░&&░░░░░░░░░░░░░░░░░░░░░░██▒▒▒▒░░████░░▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██░░░░░░░░░░ ░░░░!(░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██▒▒▒▒░░░░██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██░░░░░░░░ ░░░░░░|)░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░████▒▒▒▒░░██▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██░░░░░░ ░░░░░░&&░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██▒▒▒▒░░████▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒██░░░░░░░░ ░░░░░░!!░░░░░░░░░░░░░░░░||░░░░░░░░░░░░░░░░░░░░░░██▒▒▒▒░░░░██▒▒▒▒▒▒▒▒▒▒████▒▒██░░░░░░░░ ░░░░))░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██▒▒▒▒▒▒░░██▒▒▒▒████▒▒▒▒░░░░████░░░░ ░░░░((░░░░░░░░░░░░░░░░░░░░&||&░░░░░░░░░░░░░░░░░░░░░░████▒▒▒▒░░████▒▒▒▒░░░░░░░░░░▒▒▓▓░░ ░░^^░░░░░░░░░░░░░░&&░░░░░░░░&&░░░░░░░░░░░░░░░░░░░░░░████▒▒▒▒▒▒░░░░░░░░░░░░░░░░▒▒██ ░░░░||&&░░░░░░░░░░░░&&&&░░░░░░&&░░░░░░░░░░||||░░░░░░░░░░████▒▒▒▒░░░░░░░░░░▒▒████░░ ░░░░░░!!!!░░░░░░░░░░░░&&&&░░░░((()))░░░░░░░░||░░░░░░░░░░░░████▒▒▒▒░░▒▒████░░░░ ░░░░░░&&░░░░░░░░░░░░░░&&░░░░░░░░&&░░░░░░░░||░░░░░░░░░░░░░░████▒▒██░░░░░░░░ ░░░░░░░░░░░░░░░░░░░░&&░░░░░░&&░░░░░░░░||░░░░░░░░░░░░░░░░░░██░░░░░░░░░░ ░░░░░░░░░░░░░░░░&&░░░░░░&&░░░░░░░░||░░░░░░░░░░░░░░░░░░░░░░░░░░░░ ░░░░░░&&░░░░░░░░░░░░░░░░░░^^^^^^░░░░░░░░░░░░░░░░░░░░ ░░░░░░ ░░░░░░░░░░░░&&░░░░░░░░░░░░░░░░░░ ░░░░░░░░░░░░░░░░░░░░
Preface =============================================================================================
pre-preface: I wrote this a few years ago and once
finishing I read something which made me think the conclusion
I came to was "CNF vs DNF" format. I brushed it off as
"p vs np is a compression problem" and never uploaded this.
More of a backup than anything, but either way here it is.
See: FOLLOW_UP.md
I am a programmer and computer nerd, and not a math nerd. The words you are about to read will from a programming point of view. From a lax, joking, and standards hating pupil of the world :]
Rarely do I read math related content. I barely read papers written in official formats for any fields. Occasionally interest like biology or an exploit write-up warrants a read. Excuse any misunderstanings or lack of math-based expressions. Boolean will be in C-like syntax, until it isn't.
Assumptions will be made about the reader's understanding of boolean logic, programming, algeabra and complex correlation.
The boolean satisfiability problem makes some assumptions I believe, and hope to show, are misconstrued.
Boolean Basics =============================================================================================
At the heart of boolean math, you have 2 core functions. These are the AND && and OR || functions. For simplicity I will refer to them as & and |. There are other operators, such as NOT ! groups () and even XOR ^. These represent a simple binary switch, a 1 or 0, on or off.
Let's explore.
AND
a & b
If a
is true, and b
is true, the statement is true.
Simple enough. How about the | operator?
OR
a | b
If a
is true, or b
is true, the statement is true.
If you have programmed or seen the P vs NP problem, you are no doubt familiar with the basics.
From there we get more complexities, groupings, variables. But all of these will be clearly calculatable, just set said variables to true and you're done.
Here we come to the ! operator.
NOT
!a
NOT: meaning we flip the value of a
.
If it was 1, it is now 0. If 0, now it's 1.
It is easy to see if we set this to 0 then it will ultimately
pass this simple example.
The operation of ! is what makes the boolean satisfying difficult. Later we will massively 'expand' on this issue. But for now we will keep going forward.
If you have
a & !a
There is no way to solve this.
You can not have, and not have, a
. It is a single state.
LAZY
Before we get into operations, I would like to simplify
a couple more things. The first is important to the whole
argument being made. Everything, EVERYTHING, in boolean
math can be reduced to what we will call chains
or
and chains
.
This is a larger boolean example than those above.
a & b | c & ( d & e)
We haven't touched too much on expansion and compression yet, but every boolean problem, no matter the operations, can be expressed as sets of &s separated by |s. With or without !s, even ^s. This is key.
You have 2 actual problems here. Each side representing an & chain.
a & b
|
c & d & e
Going forward, and in code, the & will be implied.
ab | cde
is much easier to read without &. It
will also become important in the way code behaves.
Turing Solutions and Operations =============================================================================================
In progamming it is relatively easy to flip bits, move memory and reduce complexity of operations by using a binary format.
For example, I am running a x86_64 ISA, 64bit amd64 based CPU. The 64 bits used for data can be taken advantage of to reduce operations. A lookup-table referring to bits can be used to set values.
e.g.
// ( = 0x28 hex = 40 decimal
lookup_table[255] = { 0 };
...
lookup_table[33] = 8; // 00001000 binary, ! points here
...
lookup_table[40] = 2; // 00000010 binary, ( points here
lookup_table[41] = 1; // 00000001 binary, ) points here
...
lookup_table[124] = 4; // 00000100 binary, | points here
...
t_master[j] |= lookup_table[(unsigned char)input[i]];
By doing this you can assign a bit value 1, 2, 4, 8...n
per token. This limits you to only 64 tokens, since
we have a 64bit machine. It does however, allow for
quick instructions like the |=
which will add
the bit value to a variable. Unfortunately we do not
have "infinite-bit" computers, but this method is
used in computing today.
Now that we've established how we will be using values. We will do a check for more complex & ! | chains. We will limit the bits to 8 so examples are managable.
For a chain of abc
the bits in memory would look like
11100000
abc_____
Back to our simple ! example
a!a
The unsolvable a
but !a
. This requires 2 variables,
pos[itive] and neg[ative], for inclusions and exclusions.
pos = 10000000
neg = 10000000
There is a conflict here, both can't be true. Easy enough to test for: & pos and neg, if your value is non-0, there is a conflict and this can not be true.
1000 & 1000 = 1000
1000 != 0
So far the examples have all been a few operations per-input.
But I would like to ask this before we go any further.
Let's say your friend sends you a file. But you don't know what this file is. So you ask...
yyy: What's this file?
fff: It's 10MB
yyy: I see that, but I can't open it
fff: It's compressed. You have to unzip it
yyy: Oh, so it's not 10MB?
fff: No, it's 10MB
You spend time figuring out what the file is: tar, zip, 7z, rar - you decompress it. Now it's a 100MB file.
Did your friend send you a 10MB file? Or a 100MB file?
You surely can't use the 10MB version. It's only when you expand and unravel the encoded data that you have what something you recognize.
Clearly, I side with the 100MB file.
The reduction of expressions has been shown by various people in various fields, but the expansion is what I'm concerned with. I do not believe reduced problems adequately represent their size; nor present the amount of operations required obviously to the observer. "This is my thesis statement." [Planes, Trains, and Plantains]
With math we recognize what's going on without reformatting the original problem. Whether it's Σ, some other loop, function, or what we will be using here: distributive property.
Expand/Distribute/Condense/Compress/Simplify =============================================================================================
As previously stated, all boolean math can be expanded, with the distributive property, to chains of & statements separated by | statements. This is contrast to the commonly used CNF format, which uses | chains separated by & statements. Also, as previously stated, the only reason we're working these problems at all is because they have potential for a conflicting ! value in them.
The distribution of a & operator cleans up string syntax for parsing issues. As shown previously, this is also useful because you can set the bit for a given token in less space/variables.
a(b|c)
Becomes
(ab) | (ac)
Note there are now only 2 & chains instead of 3. The
ab
could be represented in memory as 11000000
,
and the ac
as 10100000
.
This is much like the distributive property for basic algebra.
4(1 + 4)
Becomes
(1 + 1 + 1 + 1 + 4 + 4 + 4 + 4)
Of course, we would not add, but instead multiply, because we know the shortcut.
Here is a larger problem.
abc((d|be)|c)(((bc|ae)d)ad|c)a(bd|dc)
Without going step-by-step, the end result looks like
(abcd|abcbe|abcc)(abcbcdada|abcaedada|ca)(bda|dca)
This is not a & chain separated by | yet though. So we continue.
(abcd|xxx|xxx)(abcbcdada|xxx|xxx)(bda|xxx)
(abcd|xxx|xxx)(abcbcdada|xxx|xxx)(xxx|dca)
(abcd|xxx|xxx)(xxx|abcaedada|xxx)(bda|xxx)
...
(xxx|xxx|abcc)(xxx|xxx|ca)(xxx|dca)
And the real problem at the end of it
single line
abcdabcbcdadabda | abcdabcbcdadadca | abcdabcaedadabda|abcdabcaedadadca | abcdcabda | abcdcadca | abcbeabcbcdadabda | abcbeabcbcdadaca | abcbeabcaedadabda | abcbeabcaedadadca | abcbecabda | abcbecadca | abccabcbcdadabda | abccabcbcdadadca | abccabcaedadabda | abccabcaedadadca | abcccabda | abcccadca
multi line
abcd abcbcdada bda
abcd abcbcdada dca
abcd abcaedada bda
abcd abcaedada dca
abcd ca bda
abcd ca dca
abcbe abcbcdada bda
abcbe abcbcdada ca
abcbe abcaedada bda
abcbe abcaedada dca
abcbe ca bda
abcbe ca dca
abcc abcbcdada bda
abcc abcbcdada dca
abcc abcaedada bda
abcc abcaedada dca
abcc ca bda
abcc ca dca
The total of actual & chains can be found by
(||) (||) (|)
2+1 2+1 1+1
3 * 3 * 2 = 18
The chains outside of the | inside each () signify what's really going on.
This is not a single 38 boolean input problem. This is a 18 set of & chains separated by | operators, 267 input problem. Though we look at it differently in our minds, rationalizing and compressing/decompressing data to simplify our understanding - this is a way more complex problem than it first appeared.
This example still lacks !s that are required to
qualify as unsolveable. But these tokens could
be anything. Simplified representations of !s
that get translated to a x
e.g. x = !a
, or
an enitre set of variables abc
so that you only
have to check 1 bit instead of 3. But this is all
logic we create. Not all transmutations are
efficient for every problem, but they are doable.
When a computer compiles code, we introduce our
own condensing and logic so that it produces
efficient and fast results: transmutation. In
code, the if-statements we write become cmp
,
jne
, and je
instructions. Constantly
checking your cmp
and jmp
in-line while
checking has 2 issues: first, it adds operations
while running; second, some of those operations
may leave the code flow, which does not give us
"worst case" per a run. Other algorithms use such
techniques. I would say they are not suited for
proof-of-concept code.
I want to get into these algorithms, but you may have wondered why ! examples have been so elusive up until now. I suppose it's time I tell you why :[
NOT Hell
NOT, ! - This may be the largest reason B-SAT distribution is so difficult.
It seems innocent at first, just !a
, a pos and neg 64bit
variable, & them, no issue. But when you add the distributive
property...
Take :
!( (bc|ae) (a|b) )
expand ! to each side of either | in (|)(|)
( (!b!c|!a!e) (!a|!b) )
However...
(!b!c|!a!e)
isn't quiet the same as !(bc|ae)
.
In the orignal if you had bc
or ae
the statement
could be true, but 1 of each could be wrong.
(!b|c)(!a|!e)
follows the same logic. This means !
flips & and | symbols when expanded.
Following through gives us :
( (!b|!c)(!a|!e) | (!a!b) )
This means 3 | operators - compared to 2 before.
So instead of what appeared to be 4 branches of | chains
(|) (|)
1+1 1+1
2 * 2 = 4
There are over 4x more branches to solve the original !() - This ! drastically reduced the actual size of this problem.
(|) (|) (|)
2+1 2+1 1+1
3 * 3 * 2 = 18
Another example :
!( !(bc|ae) d (a|b) )
Even worse - double !s
Expand non ! :
!( !(dbc|dae) (da|db) )
This becomes...
!( (!d|!b|!c)(!d|!a|!e) (da|db) )
then
( !(!d|!b|!c) | !(!d|!a|!e) | !(da|db) )
then further
( !(!d|!b|!c) | !(!d|!a|!e) | ((!d|!a)(!d|!b)) )
Now !(!x) is a double negative. Which means all of the | and ! must go away. If any one of them is false, the whole expression becomes true
and that's NOT what you want
(dbc)|(dae)
Well, wait, that isn't right. Originally we had
!( !(dbc|dae) ...)
so if EITHER dbc
OR dae
became true, we ! that true we fail the inside
chain, which the oustside !false passes - we have
to add the | back in
( (dbc|dae) (!d|!a)(!d|!b) )
but even !( !(dbc|dae) (da|db) )
is wrong
! expansion order
In our original !( !(bc|ae) d (a|b) )
if d
was false, the outside !() would become true.
So the order of expansion matters too.
Trying again...
Start with : !( !(bc|ae) d (a|b) )
Expand outside ! : ( (bc|ae) | !d | !(a|b) )
The inside double !(!()) cancels out and the
implied &s become |s
Expand the newly added !() : ( (bc|ae) | !d | (!a!b) )
Is this right?
compare ! distribution
!( !(bc|ae) d (a|b) )
So if bc
were true, we would ! it.
The inside !(true) would become false, failing the rest
we would !false, and be true.
If bc
OR ae
were not true, say !c
and !e
we would have a !false: true. !true: false.
Move to d
, if that were true, !true: false.
Move to (a|b)
, say either are true
we end up with !true - and we fail overall.
Again:
If bc
or ae
chains were true...we're fine.
If d
is false...we're fine.
If both in the a|b
set fail...we're fine.
All of these conditions seem to be true for :
( (bc|ae) | !d | (!a!b) )
and
!( !(bc|ae) d (a|b) )
The original had 4 total (|) paths. The real representation happens to have 4 as well.
But, you can see in a simpler example that we could easily increase our total | count.
!(abcdefg)
becomes !a|!b|!c|!d|!e|f|!g
enough of NOT
It would seem ! hides and adds most of the complexity behind statements. & become |, | become &, includes become excludes. !! doubles up and wreak inside ()
Algebra expansion is less thought provoking
4(5*1+1)
We wouldn't multiply 5 by itself 4 times. But the () does mean we multiply 5 by 4 once in this case.
The only "positive" thing to be said, is parsing is a bit easier when you can flip & to |, and | to &. But the actions involved with !() expansion imply a lot of complexity in a single token.
! is the bane of boolean logic. It changes entire regions into | segments.
XOR
I won't go into much on XOR ^, but similar to NOT ! there are hidden expansions and compressed problems.
a^b
expands to (a && !b) | (!a && b)
And further down the line
!( !(a^b | c^a) d^e)
!( !( (a!b | !ab) | (c!a | !ca) ) (d!e|!de) )
!( (!aba!b) (!cac!a) | !ded!e)
( !(!aba!b) | !(!cac!a) !(!ded!e) )
( (a|!b|!a|b) | (c|!a|!c|a) (d|!e|!d|e) )
( (a|!b|!a|b|c|!a|!c|a) (d|!e|!d|e) )
We went from 2 | paths to 10.
Of course in this example we could reduce more.
There are duplicates and counters like !a|a
But it is another very compressed represenation of what is actually going on under the hood. Even executing a single ^ is a not a simple task.
Algorithmic Basic BSAT Solver
Given an & chain, e.g. abcde!f
, the solution
is a simple loop to perform a few operations
per input.
do_bsat(...)
...
register int i;
register unsigned long int pos = 0;
register unsigned long int neg = 0;
register unsigned char cur = 0;
for (i=0; i < strlen(b); i++) {
cur = (unsigned char)b[i];
switch(cur) {
case '!':
cur = (unsigned char)b[++i];
neg |= lookup[dec(cur)];
break;
default:
cur = (unsigned char)b[i];
pos |= lookup[dec(cur)];
}
} // for strlen(b)
return pos & neg;
Algorithmic Expansion - untie NOTs
NOT ! distribution should be done first to avoid any logic from backwards distributed values and complexity. This makes the code easier to understand for humans too :]
for (i=0; i < in_sz ; i++) {
cur_tok = in[i];
switch (cur_tok) {
case ' ': continue;
case '!':
cur_tok = in[++i]; // look ahead
if (cur_tok == '(') { // !()
l_pos++;
not_layer++;
} else // !x
ADD(dec('!') );
break;
case '(':
l_pos++;
break;
case ')':
l_pos--;
if (l_pos < not_layer)
not_layer--;
break;
case '|':
i++; // skip expected next |
if (is_notting) continue;
else or_tot++;
break;
case '&': // only used with !() dist
i++; // skip expected next &
if (is_notting) {
cur_tok = dec('|');
or_tot++;
break;
}
continue;
default: // cur_tok > 16
if (is_notting) ADD(dec('!') );
break;
} // switch(cur_tok)
ADD(cur_tok);
} // for strlen(in)
Algorithmic Expansion - distribute x()x
void do_dist(chain *or, p_tree p, int e) {
int s;
for (s=p.start; s <= e; s++) {
or[s].pos |= p.pos_dist;
or[s].neg |= p.neg_dist;
}
p.pos_dist = 0;
p.neg_dist = 0;
} // void do_dist
...
while (in[i] != 0x0) {
cur_tok = in[i];
switch (cur_tok) {
case 1: // (
p_layers[l_pos].start = chain_pos;
p_layers[l_pos].pos_dist |= cur_p;
p_layers[l_pos].neg_dist |= cur_n;
cur_p = cur_n = is_disting = 0;
l_pos++;
break;
case 2: // )
if (is_disting == 1) { // NOT (x|x)
p_layers[l_pos].pos_dist |= cur_p;
p_layers[l_pos].neg_dist |= cur_n;
} else {
or_ch[chain_pos].pos |= cur_p;
or_ch[chain_pos].neg |= cur_n;
chain_pos++;
}
is_disting = 1;
cur_p = cur_n = 0;
l_pos--;
if (l_pos == 0) { // back out to layer 0
csz_pos++;
is_disting = 0; // ok bc ( assumes disting
} else {
do_dist(or_ch, p_layers[l_pos], chain_pos);
}
break;
case 4: // |
if (l_pos > 0) // need to add layer 0
c_szs[csz_pos]++;
if (is_disting == 1) {
p_layers[l_pos].pos_dist |= cur_p;
p_layers[l_pos].neg_dist |= cur_n;
do_dist(or_ch, p_layers[l_pos], chain_pos);
} else {
or_ch[chain_pos].pos |= cur_p;
or_ch[chain_pos].neg |= cur_n;
chain_pos++;
}
cur_p = cur_n = is_disting = 0; // has to be x|x
p_layers[l_pos].start = chain_pos + 1;
break;
case 8: // !
// these should only be !x now
cur_tok = in[++i];
cur_n |= cur_tok;
break;
default: // cur_tok > 16
cur_p |= cur_tok;
break;
} // switch(cur_tok)
i++;
} // while strlen(in)
...
// wrap up code excluded in doc
// do_dist once more for layer 0
...
Algorithmic OR chain to AND chain conversion
A recursive function with the right multi-array data can work through the hidden complexities of the | chains we've determined.
chain expand(...)
{
for (n=0 ; n < cur_sz ; n++) {
cur_ch.pos = or_ch[or_start + n].pos | dist.pos;
cur_ch.neg = or_ch[or_start + n].neg | dist.neg;
if (csz_idx < c_szs[0]) { // max or-block (||)(||)(|)
chain tmp_ch = expand(zero_solves, or_ch, c_szs, or_start + cur_sz, csz_idx + 1, cur_ch);
cur_ch.pos |= tmp_ch.pos;
cur_ch.neg |= tmp_ch.neg;
} else {
if ( (cur_ch.pos | dist.pos) & (cur_ch.neg | dist.neg) ) {
// use case here
}
}
} // for n < cur_sz
} // func
Gate Logic and Problem Solving
Back to the basics on boolean operations. These are hardware circuits that allow electricity to flow.
a && b a || b a ^ b
1 op 1 op 1 op
a--\ a--\ a--\
---if both ---if either ---if one
b--/ b--/ b--/
for all there are 2 inputs with 1 output
0 0 = 0 0 0 = 0 0 0 = 0
0 1 = 0 0 1 = 1 0 1 = 1
1 0 = 0 1 0 = 1 1 0 = 1
1 1 = 1 1 1 = 1 1 1 = 0
1 true condition 3 true conditions 2 true conditions
There are different outcomes for diferent operations. Only the & is true under a single condition. This makes finding true boolean statements for & chains easy to do. The | and ^ on the other hand have multiple states they could have come from, and only a single value telling you 1 of multiple states happened.
Let's solve some algebra problems to explain why these multiple answers are an issue.
x * y = ?
How do you come up with an answer? You don't have enough information. You know there are some variables and a multiplicaiton operator.
x ? y = ?
No way to come up with an answer, you don't even know the operator.
x * y = 100
Still fairly hard to deduce, a few answers come to mind, now you may be able to guess values.
4 * y = 100
Now we have enough information to come up with
a solution for y
.
Similarly, with boolean |
a | b = ?
No way to come up with an answer
a | b = 0
Must mean both a
and b
are 0, easy to solve
However
a | b = 1
There are 3 possible answers, and even if we are given a single value
1 | b = 1
There are 2 correct states left
Alternatively: & suffers the same problem when being solved for false.
Even by cheating, like in the above algorithms, and reducing the | to an ^ for 2 total possible answers instead of 3, we can not tell which is the case for an outcome of true.
This MUST be guessed, and given the unknowns it is impossible to be certain the first guess.
Arguments and Conclusion
It can be shown with the Basic BSAT
solver that
simple & chains are solvable in Polynomial-Time.
It can be shown with logic gates, that | has multiple true conditions, while & has multiple false conditions.
If the multiple | conditions are computed by distribution, into & chains such that the solution is true, then the problem can be solved in Polynomial-Time.
This table shows the exponential growth of problems being solved, in both: basic form, with expanded & chains as input; distributive form with reduced input.
input size | basic_solve | distributed_solve | basic_instruction_count | dist_instruction_count |
---|---|---|---|---|
518400 | 0.040417 | 0.005921 | 207373123 | 11781296 |
388800 | 0.032709 | 0.005744 | 139141279 | 13133453 |
155520 | 0.022558 | 0.002048 | 51339896 | 5432433 |
77760 | 0.005535 | 0.001331 | 25867540 | 4576715 |
38880 | 0.002473 | 0.000469 | 12321431 | 1581202 |
29170 | 0.001982 | 0.000175 | 9208968 | 1474001 |
19440 | 0.001473 | 0.000124 | 6178766 | 1366859 |
9720 | 0.00054 | 0.000185 | 3067639 | 616964 |
6480 | 0.000354 | 0.000162 | 2113132 | 581251 |
3240 | 0.000044 | 0.000034 | 1130063 | 330407 |
[ github: xlsx files should be with this markdown file. more data on function calls included ]
The slow down from distributed to basic can
be attributed 2 factors: strtok
and input
size. The strtok
operation is used to read
lines from a buffer. These lines are bigger in
memory, as the distributive solution has these
longer string values reduced to a single 8 byte
value no matter the size. The 8 byte value is
computed once and used thenceforth. This results
in the distributive solution having a reduced
time to solve and instruction count. If the
basic solver was given pre-binary-converted
input, it would be: read 2 8-sized bytes, & 1.
This would take considerably less time, but the
growth of time-per-input would remain steady.
Solving for a simple & only expression takes relatively short time. The introduction of | complicates things by reducing what is actually multiple possibilities of & statements to a easier to read format for us, the user.
The placement and number of |s in a problem can make the difference between an end list of 5000 & chains or 500,000.
The amount of solvable input follows the time to complete a given Boolean SATisfiable test.
Back to the question of the friend who gave you a 10MB file. Someone I asked this question to responsed with, "I would say you have a 10MB file, but 100MB of data." I believe this is an accurate way to represent the problem at hand. The input may be 100 tokens of boolean expressions, but the data it represents could be in the 1000s or more. An &(|) statement is as much a reduction as 2^2 meaning 22, or 2^4 meaning 2222. The big difference being, while arithmetic gives infinite results - boolean does not. There is a single result, with 2 possile values, from a myriad of input. The common CNF format is | chains separated by &s, which is still a heavily reduced set of problems.