Negative lookahead seems non relational but possible

Question

Negative lookahead seems non relational but possible

Opened this issue 10 years ago · 7 comments

I've taken a few passes at implementing negative lookahead and thought I should report back on some of the difficulties I've had. My main focus has been modifying the code for positive lookahead to see if it would work for negative lookahead somehow. I was trying to model negative lookahead as the negation of positive lookahead and so I quickly got dragged into the nonrelational parts of core.logic. I'm no expert on either logic programming/constaint logic programming nor how instagenerate has been implemented so I'll try to keep from theorizing about the following:

Interesting definition for negative lookahead

(defmethod combinator-parseo :neg
  [{comb :parser :keys [hide] :as combinator} grammar strings parse-tree remaining-strings remaining-parse-tree]
  (let [comb (assoc comb :hide true)]
    (fresh [fake-remaining-strings]
      (!= strings remaining-strings)
      (!= parse-tree remaining-parse-tree)
      (partial-parseo comb grammar strings parse-tree fake-remaining-strings parse-tree))))

Simple grammar with interesting results:

(def grammar
  (insta/parser
    "S = !('B') ('A' | 'B' | 'C')"))

(run* [input]
  (fresh [abc]
    (instaparseo grammar input
                 [:S abc])))
;; (((B . _0) :- (!= (("A") (_1)))) ((B . _0) :- (!= (("C") (_1)))))

So in some weird sense instagenerate can do negative lookahead but you would have to carefully rip apart all the cons to get the correct results. It seems like if I understand how this all tied together it would be straightforward to pull these results out in the right way but so far it has escaped me.

In addition to this, I tried to project strings and check that the negative lookahead's parsers didn't successfully parse the value of strings. I don't believe I ever got strings to be ground though and so I nevere got anything but exceptions trying this. Also, I tried using nafc but it created similiar output as above that was far more verbose while being effectively the same.

Answer 1 · 2014-10-27T20:05:55.000Z

Thanks for taking a stab at this. I notice right off the bat that your "!="
statements should actually stay "==". Those statements are intended to
state "we're constraining stuff about the input, but we're not actually
eating characters while doing so." Thus, the strings == remaining strings,
and the parse tree == remaining parse tree. This is why I have "fake
remaining strings", so the parser can pretend it's eating characters but in
reality it isn't doing any damage to the input. As for the parse tree, I
just set :hide to true, and presumably it will already take care of not
affecting the parse tree.

Because negative look ahead ALSO doesn't eat characters (or parse tree), it
doesn't make sense to me to simply negate those statements. Somehow we have
to negate that "partial-parseo" statement instead.

On Monday, October 27, 2014, Citizen notifications@github.com wrote:

I've taken a few passes at implementing negative lookahead and thought I
should report back on some of the difficulties I've had. My main focus has
been modifying the code for positive lookahead to see if it would work for
negative lookahead somehow. I was trying to model negative lookahead as the
negation of positive lookahead and so I quickly got dragged into the
nonrelational parts of core.logic. I'm no expert on either logic
programming/constaint logic programming nor how instagenerate has been
implemented so I'll try to keep from theorizing about the following:

Interesting definition for negative lookahead

(defmethod combinator-parseo :neg
[{comb :parser :keys [hide] :as combinator} grammar strings parse-tree remaining-strings remaining-parse-tree](let [comb %28assoc comb :hide true%29]
%28fresh [fake-remaining-strings]
%28!= strings remaining-strings%29
%28!= parse-tree remaining-parse-tree%29
%28partial-parseo comb grammar strings parse-tree fake-remaining-strings parse-tree%29%29))

Simple grammar with interesting results:

(def grammar
(insta/parser
"S = !('B') ('A' | 'B' | 'C')"))
(run* [input](fresh [abc]
%28instaparseo grammar input
[:S abc]%29));; (((B . _0) :- (!= (("A") (_1)))) ((B . _0) :- (!= (("C") (_1)))))

So in some weird sense instagenerate can do negative lookahead but you
would have to carefully rip apart all the cons to get the correct results.
It seems like if I understand how this all tied together it would be
straightforward to pull these results out in the right way but so far it
has escaped me.

In addition to this, I tried to project strings and check that the
negative lookahead's parsers didn't successfully parse the value of
strings. I don't believe I ever got strings to be ground though and so I
nevere got anything but exceptions trying this. Also, I tried using nafc
but it created similiar output as above that was far more verbose while
being effectively the same.

—
Reply to this email directly or view it on GitHub
#1.

--Alex

Answer 2 · 2014-10-27T21:15:42.000Z

Yeah I understood a little bit better what I had been doing about twenty minutes later. The above code gives the complement of the set of results that we want. If we negate all the rules that I had above, or equivalently negate only the last constraint, then we should get what we are looking for. I'll look more into what's possible on this front then.

Answer 3 · 2014-10-27T22:09:36.000Z

Here is something that doesn't work but has the right spirit:

(defmethod combinator-parseo :neg
  [{comb :parser :keys [hide] :as combinator} grammar strings parse-tree remaining-strings remaining-parse-tree]
  (let [comb (assoc comb :hide true)]
    (fresh [fake-remaining-strings]
      (== strings remaining-strings)
      (== parse-tree remaining-parse-tree)
      (nafc partial-parseo comb grammar strings parse-tree fake-remaining-strings parse-tree))))

nafc stands for "negation as failure constraint", I believe, and should be similiar to what we want. If you look at the source for nafc then you'll see that it is only runnable when all the arguments are ground (line 2692). Additionally, after throwing in some println's, it sems that nafc only checks for runnablitly once. The previous example produces ((\A) (\B) (\C)), which is the same output as if the constraint wasn't run at all. I'm not sure if this isn't a bug or if it's the intended result. At the time it checks, comb and grammar are ground while the rest of the arguments are not. nafc is supposed to be delayed if any of the arguments aren't ground, but I'm not sure if delayed means it will be retried later (no evidence this is the case in the places I've looked so far) or if delayed means forgotten entirely.

Prayer to @swannodette: could you please clarify how nafc should work if the terms supplied to the constraint aren't ground?

Answer 4 · 2014-10-28T00:05:27.000Z

In my toy examples at the REPL, nafc seemed to work as expected. If the
nafc docstring is true, in this case it would never activate because
"fake-remaining-strings" is always fresh, so the constraint would never be
triggered.

Here is my version, which works sometimes but not always:
(defmethod combinator-parseo :neg
[{comb :parser :keys [hide] :as combinator} grammar strings parse-tree
remaining-strings remaining-parse-tree](let [comb %28assoc comb :hide true%29]
%28all
%28== strings remaining-strings%29
%28== parse-tree remaining-parse-tree%29
%28let [helper %28fn [strings]
%28fresh [fake-remaining-strings]
%28partial-parseo comb grammar strings parse-tree
fake-remaining-strings parse-tree%29%29%29]
%28nafc helper strings%29%29%29))

I'm using a closure there because I'm willing to execute the partial-parseo
check as soon as strings is ground. In reality, I would like to
short-circuit even earlier (e.g. if strings is '(\a . _0) and I'm doing a
negative lookahead on "a") but this is a reasonable compromise as I predict
it would work for most use cases.

This case doesn't work:
=> (run* [in out](instaparseo %28insta/parser)
in out))
([(\a) (:S "a")])

But this works:
=> (run* [in out](instaparseo %28insta/parser)
(seq "a") out))
()
Even more surprisingly, this doesn't work:
=> (run* [in out](instaparseo %28insta/parser)
in out)
(== in (seq "a")))
([(\a) (:S "a")])

--Alex

On Mon, Oct 27, 2014 at 3:09 PM, Citizen notifications@github.com wrote:

Here is something that doesn't work but has the right spirit:

(defmethod combinator-parseo :neg
[{comb :parser :keys [hide] :as combinator} grammar strings parse-tree remaining-strings remaining-parse-tree](let [comb %28assoc comb :hide true%29]
%28fresh [fake-remaining-strings]
%28== strings remaining-strings%29
%28== parse-tree remaining-parse-tree%29
%28nafc partial-parseo comb grammar strings parse-tree fake-remaining-strings parse-tree%29%29))

nafc stands for "negation as failure constraint", I believe, and should
be similiar to what we want. If you look at the source for nafc
https://github.com/clojure/core.logic/blob/fa9451ed57ba9647399e1e6e1d5a723e422d7c6b/src/main/clojure/clojure/core/logic.clj#L2680
then you'll see that it is only runnable when all the arguments are ground
(line 2692). Additionally, after throwing in some println's, it sems that
nafc only checks for runnablitly once. The previous example produces ((\A)
(\B) (\C)), which is the same output as if the constraint wasn't run at
all. I'm not sure if this isn't a bug or if it's the intended result. At
the time it checks, comb and grammar are ground while the rest of the
arguments are not. nafc is supposed to be delayed if any of the arguments
aren't ground, but I'm not sure if delayed means it will be retried later
(no evidence this is the case in the places I've looked so far) or if
delayed means forgotten entirely.

Prayer to @swannodette https://github.com/swannodette: could you please
clarify how nafc should work if the terms supplied to the constraint
aren't ground?

—
Reply to this email directly or view it on GitHub
#1 (comment)
.

Answer 5 · 2014-10-28T14:38:39.000Z

Here's a version of nafc that I've found useful:

(defn -my-nafc
  ([c args]
     (reify
       IConstraintStep
       (-step [this s]
         (reify
           clojure.lang.IFn
           (invoke [_ s]
             (when-not (tramp ((apply c args) s))
               ((remcg this) s)))
           IRunnable
           (-runnable? [_]
             (println "RUNNABLE?")
             (println (map #(ground-term? % s) args))
             (every? #(ground-term? % s) args))))
       IConstraintOp
       (-rator [_]
         `my-nafc)
       (-rands [_]
         (vec (concat [c] args)))
       IReifiableConstraint
       (-reifyc [_ v r s]
         `(my-nafc ~c ~@(-reify s args r)))
       IConstraintWatchedStores
       (-watched-stores [this] #{::subst}))))

(defn my-nafc
  "EXPERIMENTAL: negation as failure constraint. All arguments to the goal c
must be ground. If some argument is not ground the execution of this constraint
will be delayed."
  [c & args]
  (cgoal (-my-nafc c args)))

Nice thinking with the closure. In all the cases that didn't work, the constraint wasn't runnable because strings wasn't ground. This runs though:

(run* [in out]
  (== in (seq "a"))
  (instaparseo (insta/parser "S = !'a' 'a'") in out))

It looks like strings becomes ground via the first constraint and allows the nafc to be evaluated. Interesting that the order of the constraints matters here.

Answer 6 · 2014-10-28T16:52:52.000Z

I'm surprised that strings was not ground in those cases. Wasn't I setting
the value of strings when binding "in" to (seq "a")? I was setting it after
the constraint was established, however.

At this point I'm probably exhausting my own expertise of core.logic, since
I haven't gotten into the whole "constraint" side of things. I will need
further guidance to understand when constraints can trigger. (Maybe it is a
known limitation that constraints are doomed if the logic program goes out
of scope of the constraint?)

--Ale
x
On Tue, Oct 28, 2014 at 7:38 AM, Citizen notifications@github.com wrote:

Here's a version of nafc that I've found useful:

(defn -my-nafc
([c args](reify
IConstraintStep
%28-step [this s]
%28reify
clojure.lang.IFn
%28invoke [_ s]
%28when-not %28tramp %28%28apply c args%29 s%29%29
%28%28remcg this%29 s%29%29%29
IRunnable
%28-runnable? []
%28println)
(println (map #(ground-term? % s) args))
(every? #(ground-term? % s) args))))
IConstraintOp
(-rator []
my-nafc) (-rands [_](vec %28concat [c] args%29)) IReifiableConstraint (-reifyc [_ v r s] (my-nafc ~c ~@(-reify s args r)))
IConstraintWatchedStores
(-watched-stores [this] #{::subst}))))
(defn my-nafc
"EXPERIMENTAL: negation as failure constraint. All arguments to the goal cmust be ground. If some argument is not ground the execution of this constraintwill be delayed."
[c & args](cgoal %28-my-nafc c args%29))

Nice thinking with the closure. In all the cases that didn't work, the
constraint wasn't runnable because strings wasn't ground. This runs
though:

(run* [in out](== in %28seq))
(instaparseo (insta/parser "S = !'a' 'a'") in out))

It looks like strings becomes ground via the first constraint and allows
the nafc to be evaluated. Interesting that the order of the constraints
matters here.

—
Reply to this email directly or view it on GitHub
#1 (comment)
.

Answer 7 · 2014-10-30T17:22:32.000Z

I'm still trying to figure the terminology out but I think nafc is non-relational. core.logic is all about creating graphs of equality (relations). By introducing nonequality into the programs, we are no longer purely relational and so we run into situations where the semi-guarantees of relational programming (like the order of the constraints supposedly not changing the results) no longer hold. But yeah, I'm way outside my expertise already so far. I'll see how far I can get with just this for now.