mstechly/vqf

Pre-processing of 11*13 reports wrong factorization

peendebak opened this issue · 3 comments

@mstechly When running the pre-processing part on p*q=11*13 the code reports a factorization of 15*9, which is incorrect. I created a minimal example:

true_p=11; true_q=13; use_true_values=True; m=true_p*true_q

apply_preprocessing = True
preprocessing_verbose = True
optimization_verbose = False
if use_true_values:
    p_dict, q_dict, z_dict, clauses = create_clauses(m, true_p, true_q, apply_preprocessing, preprocessing_verbose)
else:
    p_dict, q_dict, z_dict, clauses = create_clauses(m, None, None, apply_preprocessing, preprocessing_verbose)

number_of_uknowns, number_of_carry_bits = calculate_number_of_unknowns(p_dict, q_dict, z_dict)
print("Number of unknowns:", number_of_uknowns)
print("Number of carry bits:", number_of_carry_bits)
if clauses[0] == 0 and len(set(clauses)) == 1:
    if number_of_uknowns == 0:
        p, q = decode_solution(p_dict, q_dict)
        print(f'found solution! {p}, {q}')

Hi @peendebak ! Indeed, it looks like a bug, thanks for spotting it!

Here's my "braindump":

First, I've changed the order of p and q. It's symmetrical, so it shouldn't really matter, but I remember there was a convention somewhere that p>q, but it was so long ago that I'd prefer to stick to it and minimize the risk of it all being a stupid mistake.
So now:
p=1101 and q=1011

After simplifying the clauses in create_clauses is done, we have the following:

(Pdb) p_dict
{0: 1, 1: q_2, 2: 1 - q_2, 3: 1}
(Pdb) q_dict
{0: 1, 1: 1 - q_2, 2: q_2, 3: 1}

Which is correct + there is no way, it should give us numbers 15 (1111) and 9 (1001), as this leads to contradiction right away.

The problem occurs in simplify_symmetric_case function, which returns p_dict={0: 1, 1: 1, 2: 1, 3: 1} and q_dict={0: 1, 1: 0, 2: 0, 3: 1}.

And now I see what's the issue!
Take a look at this logic:

    for key in p_dict.keys():
        if type(p_dict[key]) != int or type(q_dict[key]) != int:
            if p_dict[key] + q_dict[key] == 1:
                p_dict[key] = 1
                q_dict[key] = 0

So when we iterate through the variables in p_dict, we set both p_1 and p_2 to 1.
After first pass we unfortunately lose the information that values for p_1=q_2 and p_2=1-q_2 are actually correlated. Instead of just assigning p_1=1 and q_1=0, we should do q_2=1 and then substitute it to the whole expression.

I fixed it in #4 , however I'm not sure if this solution is 100% correct as I have not tested it properly for other numbers yet, will try to do this next week.

It looks that #4 was working ok, so I merged it.
Let me know if you have any more thoughts on this @peendebak.

@mstechly For true_p=11; true_q=13; use_true_values=True; m=true_p*true_q the correct factorization is found! For use_true_values=False a set of equations is derived, but I have not verified whether they are correct. I'll close the issue. Thanks!