vprover/vampire

Question answering

stanmoon opened this issue · 10 comments

Hi!

I asked Vampire to find the intersection of a line and a circle, the model is below.
I don't know how to retrieve the answer.
Tried different combinations of command line options.
This combination is the only one that finds a refutation.

Thanks for your help,

Stan

% define predicate circle type
% x,y,xcenter,ycenter,radius
tff(
  circletype,
  type,
  circle : $real * $real * $real * $real * $real > $o).

% define predicate line type
% x,y,xpoint,ypoint,slope
tff(
  circletype,
  type,
  line : $real * $real * $real * $real * $real > $o).

% define square function from real to real numbers
tff(
  squaretype,
  type,
  square : $real > $real).

% define a function to assign a symbol constant to a point
tff(
  pointname,
  type,
  point : $real * $real > locus).

% define a constant symbol to represent our circle
% we define it of type locus
tff(
  circlename,
  type,
  c : locus).

% define a constant symbol to represent the center of the circle
% we define it of type locus
tff(
  centername,
  type,
  center : locus).

% define a constant to represent the radius of our circle
tff(
  radiusname,
  type,
  r : $real).

% define a constant symbol to represent our line
tff(
  linename,
  type,
  l : locus).

% define a constant to represent the point for generating the line
tff(
  pointlinename,
  type,
  p : locus).

% define a constant to represent the slope of the line
tff(
  slopename,
  type,
  m : $real).

% constraint the center of our circle
tff(
  centeraxiom,
  axiom,
  point(-1.0,1.0) = center).

% constraint the radius of our circle to be the square root of 5
tff(
  radiusaxiom,
  axiom,
  square(r) = 5.0).

% constraint a point of our line
tff(
  linepointaxiom,
  axiom,
  point(1.0,2.0) = p).

% constraint the slope of our line
tff(
  slopeaxiom,
  axiom,
  m = 0.5).

%predicate with circle definition
tff(
  circleaxiom,
  axiom,
  ![X:$real,Y:$real,XC:$real,YC:$real,R:$real] : (
    circle(X,Y,XC,YC,R) <=> (
      $greater(R,0.0) &
      $sum(square($difference(X,XC)),square($difference(Y,YC))) = square(R)
    )
  )
).

%predicate with line definition
tff(
  circleaxiom,
  axiom,
  ![X:$real,Y:$real,XP:$real,YP:$real,M:$real] : (
    line(X,Y,XP,YP,M) <=> (
      $difference(Y,YP) = $product(M,$difference(X,XP))
    )
  )
).

tff(
  intersectquestion,
  question,
  ?[X:$real,Y:$real] : (
    (
      ?[XC:$real,YC:$real,R:$real] : (
        circle(X,Y,XC,YC,R) & point(XC,YC) = center & square(r) = square(R)
      )
    ) &
    (
      ?[XP:$real,YP:$real,M:$real] : (
        line(X,Y,XP,YP,M) & point(XP,YP) = p & M = m
      )
    )
  )
).

The command line parameters I use are as follows:
"--mode casc",
"--cores 0",
"--time_limit 600",
"--avatar off",
"--question_answering answer_literal"

Vampire output is as follows:

WARNING Broken Constraint: if avatar_nonsplittable_components(none) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_nonsplittable_components(none) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_congruence_closure(on) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if split_at_activation(on) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_delete_deactivated(off) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if global_subsumption_avatar_assumptions(from_current) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if global_subsumption_avatar_assumptions(full_model) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_flush_period(100000) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_flush_period(100000) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_minimize_model(off) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_flush_quotient(1.1) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_flush_quotient(1.1) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_fast_restart(on) has been set then avatar(off) is equal to on
% (47425)dis+11_4_afp=100000:afq=1.1:anc=none:cond=on:gs=on:gsaa=full_model:nm=64:nwc=1:sac=on:sp=reverse_arity:thi=all_2 on intersect for (2ds/0Mi)
% (47426)ott+1010_2:1_awrs=decay:awrsf=512:acc=on:add=off:afr=on:afp=100000:afq=1.1:amm=off:anc=none:bd=off:ccuc=first:fsr=off:fde=unused:gsp=on:gs=on:gsaa=from_current:irw=on:nm=32:newcnf=on:nwc=1:sos=theory:sp=occurrence:tha=some:uwa=interpreted_only:updr=off_8 on intersect for (8ds/0Mi)
% (47425)Time limit reached!
% (47425)------------------------------
% (47425)Version: Vampire 4.7 (commit 807e37dd9 on 2022-08-23 09:55:27 +0200)
% (47425)Linked with Z3 4.8.13.0 f03d756e086f81f2596157241e0decfb1c982299 z3-4.8.4-5390-gf03d756e0
% (47425)Termination reason: Time limit
% (47425)Termination phase: Saturation

% (47425)Memory used [KB]: 12153
% (47425)Time elapsed: 0.300 s
% (47425)------------------------------
% (47425)------------------------------
WARNING Broken Constraint: if avatar_nonsplittable_components(none) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if sat_solver(z3) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_delete_deactivated(off) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_flush_period(40000) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_add_complementary(none) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_minimize_model(sco) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_flush_quotient(1) has been set then avatar(off) is equal to on
% (47427)lrs+1010_4:1_aac=none:add=off:afp=40000:afq=1.0:amm=sco:anc=none:bd=off:cond=on:gs=on:gsem=on:irw=on:nm=0:nwc=2.5:sas=z3:stl=30:sos=theory:sp=reverse_arity:updr=off_2 on intersect for (2ds/0Mi)
% (47427)Time limit reached!
% (47427)------------------------------
% (47427)Version: Vampire 4.7 (commit 807e37dd9 on 2022-08-23 09:55:27 +0200)
% (47427)Linked with Z3 4.8.13.0 f03d756e086f81f2596157241e0decfb1c982299 z3-4.8.4-5390-gf03d756e0
% (47427)Termination reason: Time limit
% (47427)Termination phase: Saturation

% (47427)Memory used [KB]: 11641
% (47427)Time elapsed: 0.300 s
% (47427)------------------------------
% (47427)------------------------------
WARNING Broken Constraint: if avatar_nonsplittable_components(none) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if lrs_weight_limit_only(on) has been set then saturation_algorithm(discount) is equal to lrs
WARNING Broken Constraint: if avatar_delete_deactivated(off) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_flush_period(4000) has been set then avatar(off) is equal to on
WARNING Broken Constraint: if avatar_flush_quotient(1.1) has been set then avatar(off) is equal to on
% (47432)dis+10_5_add=off:afp=4000:afq=1.1:anc=none:cond=fast:ep=RSTC:fsr=off:gs=on:gsem=on:lwlo=on:nm=64:nwc=1:sp=reverse_arity:thi=all_3 on intersect for (3ds/0Mi)
% (47432)First to succeed.
% (47432)Refutation found. Thanks to Tanya!
% SZS status Theorem for intersect
% SZS output start Proof for intersect
tff(type_def_5, type, locus: $tType).
tff(func_def_0, type, square: ($real) > $real).
tff(func_def_1, type, point: ($real * $real) > locus).
tff(func_def_2, type, c: locus).
tff(func_def_3, type, center: locus).
tff(func_def_4, type, r: $real).
tff(func_def_5, type, l: locus).
tff(func_def_6, type, p: locus).
tff(func_def_7, type, m: $real).
tff(func_def_24, type, '$inst3': $real).
tff(func_def_25, type, '$inst4': $real).
tff(func_def_42, type, '$inst5': $real).
tff(func_def_43, type, '$inst6': $real).
tff(pred_def_1, type, circle: ($real * $real * $real * $real * $real) > $o).
tff(pred_def_2, type, line: ($real * $real * $real * $real * $real) > $o).
tff(pred_def_5, type, ans0: ($real * $real) > $o).
tff(pred_def_6, type, sQ1_eqProxy: ($real * $real) > $o).
tff(pred_def_7, type, sQ2_eqProxy: (locus * locus) > $o).
tff(f780,plain,(
  $false),
  inference(unit_resulting_resolution,[],[f779,f779,f778])).
tff(f778,plain,(
  ( ! [X0 : $real,X1 : $real] : (~ans0(X0,X1) | ans0(X0,X1)) )),
  inference(resolution,[],[f309,f82])).
tff(f82,plain,(
  ( ! [X0 : $real] : (sQ1_eqProxy(X0,X0)) )),
  inference(equality_proxy_axiom,[],[f47])).
tff(f47,plain,(
  ! [X0 : $real,X1 : $real] : (sQ1_eqProxy(X0,X1) <=> X0 = X1)),
  introduced(equality_proxy_definition,[new_symbols(naming,[sQ1_eqProxy])])).
tff(f309,plain,(
  ( ! [X2 : $real,X0 : $real,X1 : $real] : (~sQ1_eqProxy(X0,X1) | ~ans0(X0,X2) | ans0(X1,X2)) )),
  inference(resolution,[],[f78,f82])).
tff(f78,plain,(
  ( ! [X2 : $real,X3 : $real,X0 : $real,X1 : $real] : (~sQ1_eqProxy(X2,X3) | ~sQ1_eqProxy(X0,X1) | ~ans0(X0,X2) | ans0(X1,X3)) )),
  inference(equality_proxy_axiom,[],[f47])).
tff(f779,plain,(
  ( ! [X0 : $real,X1 : $real] : (ans0(X0,X1)) )),
  introduced(answer_literal,[])).
% SZS output end Proof for intersect
% (47432)------------------------------
% (47432)Version: Vampire 4.7 (commit 807e37dd9 on 2022-08-23 09:55:27 +0200)
% (47432)Linked with Z3 4.8.13.0 f03d756e086f81f2596157241e0decfb1c982299 z3-4.8.4-5390-gf03d756e0
% (47432)Termination reason: Refutation

% (47432)Memory used [KB]: 6140
% (47432)Time elapsed: 0.079 s
% (47432)------------------------------
% (47432)------------------------------
% (47424)Success in time 0.862 s```

Well, this is fun. The example is cool (and please let us know more about what you're doing more generally, we're interested!) but it can be reduced quite a bit.

fof(test, axiom, a = b).
fof(test, conjecture, ?[X]: p(X)).

with vampire -fde none -ep RSTC -av off -qa answer_literal gives

% Refutation found. Thanks to Tanya!
% SZS status Theorem for test
% SZS output start Proof for test
1. a = b [input]
6. a = b [cnf transformation 1]
7. ! [X0,X1] : (sQ1_eqProxy(X0,X1) <=> X0 = X1) [equality proxy definition]
8. sQ1_eqProxy(a,b) [equality proxy replacement 6,7]
10. ~sQ1_eqProxy(X0,X1) | ~ans0(X0) | ans0(X1) [equality proxy axiom 7]
20. ~ans0(a) | ans0(b) [resolution 10,8]
21. ans0(X0) [answer literal]
22. $false [unit resulting resolution 21,21,20]
% SZS output end Proof for test

-fde none, -av off and the conjecture are just "setup" and don't actually affect what I think is a bug. If you remove -ep RSTC it's satisfiable. Looking further...

Yeah, the answer-literal machinery doesn't check which polarity answer literals are, which causes trouble with equality proxy. I'm not sure what the right fix is here. Don't generate congruence axioms for answer literals? Don't resolve positive answer literals? Forbid equality proxy and answer literals used together? @selig or @hzzv - ideas welcome!

@stanmoon - you can't get an answer because this isn't a proof, just some kind of bug in the answer literal implementation. I think, at least. While we fix it, you could try the SMT-COMP schedules for getting a proper proof - I think this is more likely to give an answer because the majority of your problem is theory reasoning. Let me know if you want help with that.

@MichaelRawson - I'm preparing an interactive tutorial for a class on Intelligent Systems.
I thought it was an appropriate example for the students to model in TFF combined with question-answering.

What are the SMT-COMP schedules? (Is there a reference I could check --or is it an option in the command line arguments when invoking Vampire?)

Thanks for your help!

Great idea for the class! Let's hope we can make it work.

Vampire has a portfolio mode that tries lots of different strategies, as you discovered. Unfortunately this is also quite a good way to find corner cases, as you also discovered! We have different lists of strategies ("schedules") for different competitions: CASC is one such, SMT-COMP is another.

Unfortunately I can't find a way to invoke Vampire with the SMT-COMP schedule (which would usually be --mode smtcomp or --mode portfolio --schedule smtcomp) and TFF: the code expects an SMT-LIB logic, which of course you can't express in TPTP. You could either (i) write in SMT-LIB or (ii) I can try and find a strategy that works well for this kind of problem.

OK. Thanks!
Is this the language: http://smtlib.cs.uiowa.edu/language.shtml?
I'll have a look at it.

Yes, that's it. It's relatively simple compared to TPTP but also quite hard to read, subjectively.

hzzv commented

Re using equality proxy with answer literals: my workaround is to always use -ep off. (Meant to report/fix this some time ago but forgot about it, sorry!)

The issue here is that answer literals should never appear in the proof search with positive polarity -- but EP somehow introduces such literals. And then they get "resolved" in an unsound way.

selig commented

I suggest we promote question answering to a proper Vampire mode (not a separate option) where we hardcode equality proxy being off.

For the time being, I disabled equality proxy congruence axioms for answer predicates.

Solved by 918277d.