logic-and-learning-lab/Popper

Overfitting and wrong solution in 'find-dupl' example.

Closed this issue · 2 comments

viyx commented

Hello! Thanks a lot for your work!

I found a bug in the find-dupl example, where the task is to find a duplicate element in a list. At the top of the bias file, the expected solution is given in a comment:

%% python3 popper.py examples/find-dupl --eval-timeout=0.01
%% f(A,B) :- head(A,B).
%% f(A,B) :- tail(A,D),tail(D,C),element(C,B),f(D,B).
%% 30.24s user 1.40s system 100% cpu 31.431 total

I got the same solution (up to variable renaming).

My output:

(popper-pure) vii@Vitaliis-MacBook-Air Popper % python popper.py examples/find-dupl
13:07:44 SIZE: 1 MAX_SIZE: 40
13:07:44 SIZE: 2 MAX_SIZE: 40
13:07:44 ********************
13:07:44 New best hypothesis:
13:07:44 tp:3 fn:7 size:2
13:07:44 f(A,B):- head(A,B).
13:07:44 ********************
13:07:44 SIZE: 3 MAX_SIZE: 40
13:07:44 SIZE: 4 MAX_SIZE: 40
13:07:44 SIZE: 5 MAX_SIZE: 40
13:07:44 ********************
13:07:44 New best hypothesis:
13:07:44 tp:4 fn:6 size:7
13:07:44 f(A,B):- head(A,B).
13:07:44 f(A,B):- tail(A,D),tail(D,E),tail(E,C),head(C,B).
13:07:44 ********************
13:07:45 ********************
13:07:45 New best hypothesis:
13:07:45 tp:6 fn:4 size:12
13:07:45 f(A,B):- head(A,B).
13:07:45 f(A,B):- element(A,B),head(A,C),odd(C),geq(B,C).
13:07:45 f(A,B):- tail(A,D),tail(D,E),tail(E,C),head(C,B).
13:07:45 ********************
13:07:45 SIZE: 6 MAX_SIZE: 40
13:07:46 ********************
13:07:46 New best hypothesis:
13:07:46 tp:7 fn:3 size:18
13:07:46 f(A,B):- head(A,B).
13:07:46 f(A,B):- element(A,B),head(A,C),odd(C),geq(B,C).
13:07:46 f(A,B):- tail(A,D),tail(D,E),tail(E,C),head(C,B).
13:07:46 f(A,B):- tail(A,E),tail(E,F),tail(F,D),tail(D,C),head(C,B).
13:07:46 ********************
13:07:47 SIZE: 7 MAX_SIZE: 40
13:07:48 ********************
13:07:48 New best hypothesis:
13:07:48 tp:10 fn:0 size:7
13:07:48 f(A,B):- head(A,B).
13:07:48 f(A,B):- tail(A,C),tail(C,D),element(D,B),f(C,B).
13:07:48 ********************
********** SOLUTION **********
Precision:1.00 Recall:1.00 TP:10 FN:0 TN:10 FP:0 Size:7
f(A,B):- head(A,B).
f(A,B):- tail(A,C),tail(C,D),element(D,B),f(C,B).

But this solution is wrong: its first clause accepts the head of the list, whether or not the head is duplicated.
For example:

?- f([16,24,24],16).
true .
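
To make the failure easier to see, here is a minimal Python sketch that mirrors the two learned clauses. It assumes that head/2, tail/2 and element/2 behave like the usual list predicates (first element, rest of the list, membership); the function name `f_overfit` is hypothetical and is not part of Popper.

```python
# Python model of the learned (overfitted) hypothesis.
# Assumption: head/2, tail/2 and element/2 are the usual list predicates.

def f_overfit(xs, b):
    # Clause 1: f(A,B) :- head(A,B).
    if xs and xs[0] == b:
        return True
    # Clause 2: f(A,B) :- tail(A,C), tail(C,D), element(D,B), f(C,B).
    if len(xs) >= 2:  # both tail/2 calls need a non-empty list
        c = xs[1:]    # tail(A,C)
        d = c[1:]     # tail(C,D)
        return b in d and f_overfit(c, b)
    return False


print(f_overfit([16, 24, 24], 16))  # True: 16 is the head, but not a duplicate
print(f_overfit([16, 24, 24], 24))  # True: 24 is a real duplicate
```

The first call succeeds through clause 1 alone, so any example whose target is the head of the list is accepted without ever checking for a second occurrence.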

It turns out that in many of the positive examples the head of the list also happens to be a duplicate. So the solution is overfitted: it passes all the default positive and negative examples. I added one more negative example (neg(f([84, 44, 44, 26, 44, 74, 96, 24, 79],84)).) and got a valid solution:

********** SOLUTION **********
Precision:1.00 Recall:1.00 TP:11 FN:0 TN:11 FP:0 Size:7
f(A,B):- tail(A,C),element(C,B),head(A,B).
f(A,B):- tail(A,C),f(C,B).
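
As a sanity check, the solution above can be modelled in Python in the same way. This is only a sketch, again assuming that head/2, tail/2 and element/2 are the usual list predicates; `f_fixed` is a hypothetical name used for illustration.

```python
# Python model of the solution learned after adding the extra negative example.
# Assumption: head/2, tail/2 and element/2 are the usual list predicates.

def f_fixed(xs, b):
    if not xs:        # tail(A,C) fails on the empty list in both clauses
        return False
    rest = xs[1:]     # tail(A,C)
    # Clause 1: f(A,B) :- tail(A,C), element(C,B), head(A,B).
    if xs[0] == b and b in rest:
        return True
    # Clause 2: f(A,B) :- tail(A,C), f(C,B).
    return f_fixed(rest, b)


new_negative = [84, 44, 44, 26, 44, 74, 96, 24, 79]
print(f_fixed(new_negative, 84))   # False: 84 occurs only once
print(f_fixed(new_negative, 44))   # True: 44 occurs three times
print(f_fixed([16, 24, 24], 16))   # False: the head is no longer accepted blindly
```

Here the head is accepted only if it also occurs in the tail, which is why the added negative example is rejected.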

viyx commented

Should I open a PR?

Thanks for spotting! I just added that example and it works as expected, thanks!