Rules [re]run before session updates
polvoblanco opened this issue · 4 comments
Hi,
It appears that rules are run multiple times before insertions made in a previous run have had time to impact the conditionals.
Below is a simple example that shows, despite conditions that should prevent the rule from running more than once, the rules run every time:
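;; Assumed requires; the original post omitted the ns form.
(ns clara-test.core
  (:require [clara.rules :refer :all]
            [clara.rules.accumulators :as acc]))

(defrecord RawData [value])
(defrecord ProcData [value])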
(defrule proc-data
  [RawData (= ?value value)]
  [:not [ProcData (= value ?value)]]
  =>
  (prn "SET" ?value)
  (insert-unconditional! (->ProcData ?value)))
(comment
  (-> (mk-session 'clara-test.core)
      (insert-all (mapv #(->RawData %) ["ab"
                                        "abc"
                                        "abcd"
                                        "abc"
                                        "abcde"
                                        "abcd"
                                        "abcdef"
                                        "abc"
                                        "abcdef"
                                        "abcdef"
                                        "abcdef"]))
      (insert (->ProcData "abc"))
      (fire-rules))
  )
The results I am getting are:
"SET" "ab"
"SET" "abcdef"
"SET" "abcde"
"SET" "abcd"
"SET" "abcd"
"SET" "abcdef"
"SET" "abcdef"
"SET" "abcdef"
Where I would expect to get:
"SET" "ab"
"SET" "abcdef"
"SET" "abcde"
"SET" "abcd"
I have also tried it with the rule:
(defrule proc-data-2
  [RawData (= ?value value)]
  [?proc-vals <- (acc/all :value) :from [ProcData]]
  =>
  (when-not (some #(= ?value %) ?proc-vals)
    (prn "SET2" ?value)
    (insert-unconditional! (->ProcData ?value))))
And get:
"SET2" "ab"
"SET2" "abcdef"
"SET2" "abcdef"
"SET2" "abcdef"
"SET2" "abcdef"
"SET2" "abcd"
"SET2" "abcde"
"SET2" "abcd"
Again, I would not expect to see any duplicates. So, it seems that despite the RHS inserting a record, that insertion is not present by the time the rule runs again.
Any ideas?
Thanks,
Paul
@polvoblanco,
Based on:
(mapv #(->RawData %) ["ab"
                      "abc"
                      "abcd"
                      "abc"
                      "abcde"
                      "abcd"
                      "abcdef"
                      "abc"
                      "abcdef"
                      "abcdef"
                      "abcdef"])
It looks like you are inserting duplicates.
I don't believe that Clara makes any provisions to deduplicate similar facts being inserted, as duplicate facts might be intended by consumers.
Hi, thanks for the quick response.
The data that is passed into Clara will contain duplicates. In the above example they are obviously redundant, but that was created purely to show the problem; in the real system they are taken from much larger records where other parts of the data are distinct.
The issue I'm having is that the value is inserted unconditionally into ProcData, and that is then used to prevent the rule from either firing (in the first example) or re-inserting (in the second example, although the RHS will still fire). I feel the second example is more telling, as it shows that when the rule is run for the second, third, fourth... time, the records have not been updated from the previous runs.
Thanks again,
Paul
There is no attempt by the rules engine to determine what a "duplicate" is, or whether that does or doesn't make sense for your domain.
I am also generally not a fan of using insert-unconditional! for most rule patterns, since it creates a situation where the outcome depends on the firing order of your rules. There is often a solution that works without "unconditional" and takes advantage of the built-in truth maintenance system (aka the TMS), which lets rules evaluate as declarative-style logic expressions, so you do not need to be concerned with "when things fire" at the evaluation level.
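As a minimal sketch of what the truth maintenance point means in practice (not taken from the thread): a fact inserted with insert! is tied to the match that produced it, so retracting the supporting RawData should also remove the derived ProcData. The namespace clara-test.tms-sketch, the rule derive-proc-data, the query proc-data-q and the helper proc-values are made-up names for illustration.

```clojure
(ns clara-test.tms-sketch
  (:require [clara.rules :refer [defrule defquery mk-session insert insert!
                                 retract fire-rules query]]))

(defrecord RawData [value])
(defrecord ProcData [value])

;; Logical insert: the ProcData is only "true" while its RawData is present.
(defrule derive-proc-data
  [RawData (= ?value value)]
  =>
  (insert! (->ProcData ?value)))

;; Hypothetical query used only to inspect the session.
(defquery proc-data-q
  []
  [?proc <- ProcData])

(defn proc-values
  "Returns the set of ProcData values currently in the session."
  [session]
  (set (map (comp :value :?proc) (query session proc-data-q))))

;; Session where the supporting RawData is present.
(def with-fact
  (-> (mk-session 'clara-test.tms-sketch)
      (insert (->RawData "ab"))
      (fire-rules)))

;; Same session after the supporting RawData is retracted.
(def without-fact
  (-> with-fact
      (retract (->RawData "ab"))
      (fire-rules)))
```

Calling (proc-values with-fact) and (proc-values without-fact) shows the difference: the derived fact should be present in the first session and gone in the second. With insert-unconditional! the ProcData would stay behind after the retraction.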
That said, a common pattern to "remove duplicates" is to accept that some rule matches will produce duplicates, possibly from the input data. Have the rule that creates the possible duplicates insert some "intermediate type fact" that is then accumulated by a second rule. The accumulation rule can then choose how to take possible duplicates and "narrow them down" to a single fact that other rules can use, duplicate-free. This essentially means that the rule writer decides what a duplicate means and how to "de-duplicate" it if they have that need (e.g. the "cardinality of matches" isn't important for their rule processing).
Here is a rough example (note I haven't run it, so sorry if there are small issues):
(defrecord RawData [value])
(defrecord ProcData [value])
;; "intermediate fact" that may have "duplicates"
(defrecord ProcDataMatch [value])
(defrule proc-data-match
  [RawData (= ?value value)]
  ;; Some criteria to join/enhance a `RawData` into a (possibly duplicated) `ProcDataMatch`
  =>
  (insert! (->ProcDataMatch ?value)))
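(defrule proc-data
  ;; Binding ?value inside the accumulator groups the matches per distinct value.
  [?proc-matches <- (acc/all) :from [ProcDataMatch (= ?value value)]]
  =>
  ;; Assuming any `ProcDataMatch` with the same `value` can be treated the same, just use `first` to
  ;; take one, and unwrap its `:value` so `ProcData` holds the raw value rather than the match record.
  (insert! (->ProcData (:value (first ?proc-matches)))))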
(comment
  (-> (mk-session 'clara-test.core)
      (insert-all (mapv #(->RawData %) ["ab"
                                        "abc"
                                        "abcd"
                                        "abc"
                                        "abcde"
                                        "abcd"
                                        "abcdef"
                                        "abc"
                                        "abcdef"
                                        "abcdef"
                                        "abcdef"]))
      (insert (->ProcData "abc"))
      (fire-rules))
  )
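One way to check that the pattern above leaves a single ProcData per distinct value is to query the session after firing. The following is a self-contained sketch under a few assumptions: the namespace clara-test.dedupe-check, the query all-proc-data and the var proc-values are hypothetical names, the input list is shortened, and the second rule stores the unwrapped :value of the first match.

```clojure
(ns clara-test.dedupe-check
  (:require [clara.rules :refer [defrule defquery mk-session insert-all insert!
                                 fire-rules query]]
            [clara.rules.accumulators :as acc]))

(defrecord RawData [value])
(defrecord ProcData [value])
;; Intermediate fact that may be duplicated.
(defrecord ProcDataMatch [value])

(defrule proc-data-match
  [RawData (= ?value value)]
  =>
  (insert! (->ProcDataMatch ?value)))

(defrule proc-data
  ;; Binding ?value in the accumulator groups matches per distinct value.
  [?proc-matches <- (acc/all) :from [ProcDataMatch (= ?value value)]]
  =>
  (insert! (->ProcData (:value (first ?proc-matches)))))

;; Hypothetical query used only to inspect the result.
(defquery all-proc-data
  []
  [?proc <- ProcData])

;; Sorted vector of every ProcData value left in the session after firing.
(def proc-values
  (->> (-> (mk-session 'clara-test.dedupe-check)
           (insert-all (mapv ->RawData ["ab" "abc" "abcd" "abc" "abcdef" "abcdef"]))
           (fire-rules)
           (query all-proc-data))
       (map (comp :value :?proc))
       sort
       vec))
```

If the accumulator groups by ?value as intended, proc-values should list each input value once even though "abc" and "abcdef" were inserted several times.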
Hi, thanks @mrrodriguez that does the trick, guess I still have a lot more to get my head around.
Cheers, Paul