oracle-samples/clara-rules

Incorrect truth maintenance for equal tokens from the same production

Closed this issue · 11 comments

I would expect the following test to pass:

(deftest test-duplicate-insertions-with-only-one-removed
  (let [r (dsl/parse-rule [[LousyWeather]]
                          (insert! (->Cold 10)))
        q (dsl/parse-query [] [[Cold (= ?t temperature)]])
        query-session (fn []
                        (-> (mk-session [r q])
                            (insert (->LousyWeather))
                            (insert (->LousyWeather))
                            (fire-rules)
                            (retract (->LousyWeather))
                            (fire-rules)
                            (query q)))]
    (is (= [{:?t 10}] (query-session))
        "Removal of one duplicate fact that causes a downstream rule to fire should not
         retract insertions that were due to other duplicate facts.")))

Instead, it fails to find any Cold facts:

FAIL in (test-duplicate-insertions-with-only-one-removed) (test_rules.clj:3414)
Removal of one duplicate fact that causes a downstream rule to fire should not
         retract insertions that were due to other duplicate facts.

expected: [{:?t 10}]
  actual: ()

expected: nil to be an instance of clojure.lang.PersistentVector
     was: nil is an instance of nil

I believe the root cause of this can be found at [1] in memory.cljc. Specifically,
the facts inserted due to a token are simply added to the facts under that token:

(assoc token-facts-map token (into previous-facts facts))

When that token is retracted, every fact inserted due to it is then returned
for retraction through the relevant alpha nodes:

(select-keys token-facts-map tokens)

The problem with this is that the sequence of facts under a token in the token-facts-map
may not in fact correspond to a single firing of a rule. Facts from multiple firings of a rule will
be grouped under the same entry if the tokens that caused those rule activations are equal. When one of those tokens
is retracted, all the facts that resulted from tokens equal to it will be retracted, even if the other tokens
are still present. For the purely "forward" path (i.e. without retractions) the various remove-first-of-each!
calls should handle equal tokens like this correctly.
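
To make the failure mode concrete, here is a minimal REPL sketch using simplified, hypothetical map shapes (stand-ins for illustration, not the actual memory implementation):

(let [token   :lousy-weather-token   ; stand-in for the engine's Token
      cold-10 {:temperature 10}      ; stand-in for (->Cold 10)
      ;; the first activation's insertion:
      m1 (assoc {} token [cold-10])
      ;; a second, equal activation merges into the SAME entry:
      m2 (update m1 token into [cold-10])]
  ;; retracting ONE token then selects BOTH facts for retraction:
  (select-keys m2 [token]))
;; => {:lousy-weather-token [{:temperature 10} {:temperature 10}]}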

I see several possible ways to resolve this:

A. The engine could be changed in such a way that the tokens from different rule activations
are never equal. This seems like a major change, particularly given the way tokens are built
to retract other tokens that are equal to the built one; [2] is one example.

B. Group the facts under an "activation ID" in the memory and only return one grouping by activation
ID per token when remove-insertions! is called (a sketch follows option C below).

C. Modify insert-facts! so that even when it is called multiple times in the RHS, add-insertions!
in the memory is only called once. This requires grouping in the memory as well, but it could
just be per-call rather than requiring an additional ID.
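
To make option B concrete, here is a hypothetical sketch of the activation-ID grouping; the shapes and names are illustrative only, not taken from either prototype:

;; Facts live under token -> activation-id -> facts. Two equal activations
;; get distinct IDs, so their facts stay separable:
(let [m {:token-x {:activation-1 [:a :b :c]
                   :activation-2 [:a :b :c]}}
      ;; retracting the token picks ONE grouping to remove:
      [act-id facts] (first (get m :token-x))]
  {:retracted facts
   :remaining (update m :token-x dissoc act-id)})
;; => {:retracted [:a :b :c]
;;     :remaining {:token-x {:activation-2 [:a :b :c]}}}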

I have written prototypes for B and C, at [3] and [4] respectively. Please note that I'm not suggesting
that either be merged now; I believe there is still some work to do on both. I'd like to get
some feedback before getting much deeper into implementation. The basic advantage of B is that
it avoids adding state; the disadvantage is that it pushes changes further into the engine than C does. The
advantage of C is that it restricts the area of code that changes, but it introduces a new layer
of stateful indirection that B avoids. Perhaps A is more practical than I believe, or perhaps you see an
entirely different option.

Also, I haven't given much consideration to how this will impact the inspection namespace or clara-tools
(or for that matter how they are handling the case now). That is probably worth giving some thought to.

  1. https://github.com/rbrush/clara-rules/blob/0.10/src/main/clojure/clara/rules/memory.cljc#L270
  2. https://github.com/rbrush/clara-rules/blob/0.10/src/main/clojure/clara/rules/engine.cljc#L392
  3. WilliamParker@52e1183
  4. WilliamParker@ea02290

I see that those two branches I pushed have failing builds in Travis CI:

https://travis-ci.org/WilliamParker/clara-rules/builds/108717659
https://travis-ci.org/WilliamParker/clara-rules/builds/108716880

I'm not sure why my local builds passed. At first glance the errors appear to be unrelated to the core logic/design considerations here, but I'll need to look at them (although not tonight).

To give my thoughts:

Ignoring option A on the assumption that it is a more drastic change than we would probably make right now, I like option C. It minimizes the changes and functions involved in maintaining the truth maintenance memory for insertions.

It keeps the logic closer to the memory. My thought is that we may circle back and try to structure this a bit differently later, but we should fix the defective behavior around duplicate fact insertions now.

Considering option C, what happens if some token X gets inserted multiple times, then retracted multiple times? I'm thinking of the following flow:

  1. Token X activates a rule that inserts facts A B C, add-insertions! A B C
  2. Token X' (equal to X) activates the rule, inserts fact A' B' C', does not add more insertions
  3. Retract token X; the first A B C are retracted, remove-insertions! A B C from memory
  4. Retract token X'; then A', B', and C' should be retracted as well, but with option C we didn't call add-insertions! for them, so we wouldn't know to retract them.

Is this reasoning accurate? If so, option B seems preferable to me.

A variation of option B occurred to me that might be a bit simpler. Rather than having an "activation ID", gather all the facts inserted during an activation and change add-insertions! to keep them as a list. So if Token X and Token X' are both inserted, add-insertions! keeps separate lists of the facts inserted by each. Since the tokens are identical, these lists should be the same. When some Token X is retracted, we just retract one of the lists, leaving the other.

This basically keeps facts that were inserted during an activation together, and retracts them together, distinctly from equivalent activations, but doesn't require an identifier to correlate them. (Hopefully this is making sense. ;)
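
A minimal sketch of this variation, assuming simplified map shapes and hypothetical bang-less function names (not the actual memory implementation):

;; add-insertions! would keep one list per activation under the token:
(defn add-insertions [m token facts]
  (update m token (fnil conj []) facts))

;; retraction removes exactly one activation's list, leaving the rest:
(defn remove-insertions [m token]
  (let [[retracted & remaining] (get m token)]
    {:retracted retracted
     :memory    (assoc m token (vec remaining))}))

(-> {}
    (add-insertions :token-x [:a :b :c])   ; Token X's activation
    (add-insertions :token-x [:a :b :c])   ; equal Token X''s activation
    (remove-insertions :token-x))
;; => {:retracted [:a :b :c], :memory {:token-x [[:a :b :c]]}}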

@rbrush I won't speak completely for Will, but scenario C above is intended to implement what you said here
#171 (comment)

All insertions in this "cache" (I think a better name may be "batch"?) are in the scope of a single activation of the rule. The insertions are cleared out and stored into memory for a single instance of a duplicate token before the next token comes into the picture. This means that tokens X and X' each batch the facts associated with their own single rule activation.

Basically, I think option C is supposed to handle the case described at #171 (comment), unless I'm missing something in how it is currently implemented. I'd have a few notes on the specific implementation, but overall I think its core idea addresses this.

Will pointed out to me yesterday that a rule activation may not be "done" creating insertions all in one go. Unfortunately this is the case because there is currently no good way to enforce otherwise.

e.g.

(defrule n
  <LHS>
  =>
  (doseq [x ?stuff] (insert! x))) ;; Should have done (insert-all! ?stuff) or (apply insert! ?stuff)

This causes insertions to arrive an unbounded number of times within a single activation. That is what this "cache" is for. Once the RHS for this activation is done, the cache is cleared out and the truth maintenance memory is updated for that specific token. If another equal token comes, it will batch its facts and put them in memory again after its activation. When retracting, we don't really know which batch to retract since the tokens are equal, so we just retract one of the groups from memory.
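
A rough sketch of the batching idea, with hypothetical names (*pending-insertions*, rhs-insert!, fire-activation!) that are not the actual engine implementation:

;; Facts inserted during the RHS accumulate here, however many times
;; the insert function is called:
(def ^:dynamic *pending-insertions* nil)

(defn rhs-insert! [& facts]
  (swap! *pending-insertions* into facts))

(defn fire-activation!
  "Run the RHS for one token, then flush the whole batch to memory in a
  single add-insertions!-style update."
  [memory token rhs-fn]
  (binding [*pending-insertions* (atom [])]
    (rhs-fn)
    (update memory token (fnil conj []) @*pending-insertions*)))

;; Even a doseq of single insert calls yields one batch per activation:
(fire-activation! {} :token-x #(doseq [x [:a :b :c]] (rhs-insert! x)))
;; => {:token-x [[:a :b :c]]}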


Ideally, we would have a system in place that truly tracks facts in memory and the token they were directly associated with, beyond value equality. That way we would have a cleaner picture and could remove precisely the information that was put in for a given element or token being retracted. I think that is a larger undertaking, though, and maybe out of scope.

I do plan to log a follow-on issue related to this topic, though, because there are some oddities that can happen around objects that have different metadata but are equal, given how they are added to and removed from memory and truth maintenance memory.

I think that is its own topic, though, and I want to provide examples. This issue stands somewhat in the way of that due to the strange properties around equality in truth maintenance that we are addressing here.


Sorry for the lengthy post!

Ah, in that case +1 for option C. I had misunderstood the approach.

I basically agree with the comment above by @mrrodriguez.

Stepping back from the specific implementations, I think we either need to guarantee that mem/add-insertions! is called at most once per rule activation, or give mem/add-insertions! visibility into which inserted facts actually come from the same rule activation.

For the cache option (C), add-insertions! would be called immediately after the eval'ed RHS function returns. Until then, all facts inserted in the RHS would be cached in a stateful construct. Regarding your comment:

Token X activates a rule that inserts facts A B C, add-insertions! A B C
Token X' (equal to X) activates the rule, inserts fact A' B' C', does not add more insertions

I don't believe the "does not add more insertions" part is correct. I'd expect those insertions to be added. Did you see some reason that add-insertions! wouldn't be called again?

For option B, each individual activation of a rule would be assigned an ID, but this is purely a mechanism for grouping facts by the activation they came from; at least in the version I wrote, there was no lookup by this activation ID later. It is just a grouping mechanism that works even when mem/add-insertions! can be called more than once per rule activation. If mem/add-insertions! is only called once per rule activation, then a plain sequence works.

I'll work on a more polished pull request for this based on option C. @rbrush, @mrrodriguez, or anyone else looking at this: feel free to look at the draft and give feedback, but it would be completely reasonable to wait for a more polished version before spending time on a detailed review.

I have submitted a pull request for this at #173.

Some highlights:

  • The return value of memory/get-insertions changed to return a sequence of sequences of facts, with each inner sequence representing a single rule activation (see the sketch after the references below). I also added a get-insertions-all function to the IMemoryReader protocol that returns the map from tokens to sequences of insertion groups. I don't see a good way to avoid changing memory signatures here, since the idea of a map from tokens to flat fact sequences doesn't work when the facts have logical groupings (in this case, by the RHS activation they come from). However, this is potentially a breaking change for any tooling using the memory directly. I fixed the inspect and durability namespaces to use the new signatures.
  • I moved the RHS retract! functionality from clara.rules into the engine; the function in clara.rules is now a passthrough. A function relying on engine internals seemed to belong there. I also added tracing to it. Since insertions use the same tracer for unconditional insertions in the RHS and for top-level insertions called on the session [1, 2], I did the same here (that is, used the same tracer call that top-level retractions use). All retractions are then batched and performed after all insertions from a RHS. This is necessary since the fact to be retracted might have been caused by an earlier insertion (which the rules engine now batches to occur later). Since the retraction will still happen before any other RHSes are triggered by these actions, I believe this is still logically valid; that said, there could be an edge case I'm missing somewhere. In principle there might be performance impacts due to the reordering, but in that case users would have been relying on the engine's ordering anyway. I believe in some edge cases this relies on retracting a fact that isn't actually present not throwing an exception; I added a test for this in test-rules.
  • A manual test of clara-tools with the clara.examples.tools ns shows it working with these changes on my machine. Since it uses the inspect namespace [3] and this change is passive to the type signature of clara.tools.inspect/inspect (albeit with changes to the actual output for the cases that were originally broken here), this makes sense. It looks to me like clara-tools has other problems with equal facts, but this change at least shouldn't cause additional problems, and fixing those other problems seems out of scope here.
  1. https://github.com/rbrush/clara-rules/blob/0.10/src/main/clojure/clara/rules/engine.cljc#L221
  2. https://github.com/rbrush/clara-rules/blob/0.10/src/main/clojure/clara/rules/engine.cljc#L1093
  3. https://github.com/rbrush/clara-tools/blob/master/src/main/clojure/clara/tools/ui/session.clj#L70
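
As mentioned in the first highlight above, here is a hypothetical before/after illustration of the get-insertions shape change; :token and cold-10 are stand-ins, not the real protocol output:

(def cold-10 {:temperature 10}) ; stand-in fact

;; Before: a flat sequence of facts per token; equal activations conflated:
(def old-shape {:token [cold-10 cold-10]})

;; After: one inner sequence per rule activation:
(def new-shape {:token [[cold-10] [cold-10]]})

;; Retracting the token now removes a single inner group, leaving the other:
(update new-shape :token rest)
;; => {:token ([{:temperature 10}])}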

Wow, very nicely done. I really appreciate the detailed description and agree with your reasoning, and the code looks good after an initial pass. I'll look more closely at this today but I don't expect to find anything too major.

I don't think the changes in the memory namespace will break anyone, but we'll bump the version to 0.11.0 and mention it in the change log just to keep things visible.

As for clara-tools, that remains more of a part-time hobby/experiment. I had planned on doing some significant refactoring that changes it anyway, so additional issues with clara-tools can be deferred until later.

Merged!

The code looks good and clara-benchmark didn't show any performance regression. Many thanks for this great contribution!