replikativ/datahike

[Bug]: Inconsistent treatment of invalid constant values

Closed this issue · 2 comments

What version of Datahike are you using?

0.6.1551

What version of Java are you using?

openjdk version "20.0.1" 2023-04-1

What operating system are you using?

Ubuntu 22

What database EDN configuration are you using?

(let [schema {:name   {:db/unique :db.unique/identity}
                :friend {:db/valueType :db.type/ref}}
        db (d/db-with (db/empty-db schema)
                      [{:db/id 1 :id 1 :name "Ivan" :age 11 :friend 2}
                       {:db/id 2 :id 2 :name "Petr" :age 22 :friend 3}
                       {:db/id 3 :id 3 :name "Oleg" :age 33}])]
....

Describe the bug

I discovered an inconsistency in how queries are evaluated. In the file lookup_refs_test.cljc, there is the following test:

 (is (= (d/q '[:find ?e
                  :in $ [?e ...]
                  :where [?e :friend 3]]
                db [1 2 3 "A"])
           #{[2]}))

which passes.

Based on that test, I wrote this test, which also passes:

(is (= (d/q '[:find ?e
                  :in $ [?e ...]
                  :where [?e :friend 3]]
                db ["A"])
           #{}))

What is potentially problematic here is that ?e cannot be bound to the value "A" because it is not a valid entity id reference. The query engine chooses to handle this by ignoring the invalid binding and the result of the query becomes #{}. However, if I write a similar test,

(is (= (d/q '[:find ?e
                    :in $ ?e
                    :where [?e :friend 3]]
                  db "A")
             #{}))

I get this stack trace:

1. Unhandled clojure.lang.ExceptionInfo
   Expected number or lookup ref for entity id, got "A"
   {:error :entity-id/syntax, :entity-id "A"}
                utils.cljc:  123  datahike.db.utils$entid/invokeStatic
                utils.cljc:   97  datahike.db.utils$entid/invoke
                utils.cljc:  127  datahike.db.utils$entid_strict/invokeStatic
                utils.cljc:  126  datahike.db.utils$entid_strict/invoke
                query.cljc:  920  datahike.query$resolve_pattern_lookup_refs/invokeStatic
                query.cljc:  913  datahike.query$resolve_pattern_lookup_refs/invoke
                query.cljc: 1073  datahike.query$_resolve_clause_STAR_/invokeStatic
                query.cljc:  975  datahike.query$_resolve_clause_STAR_/invoke
                query.cljc: 1086  datahike.query$_resolve_clause$fn__36624/invoke
          query_stats.cljc:   31  datahike.query_stats$update_ctx_with_stats/invokeStatic
          query_stats.cljc:   19  datahike.query_stats$update_ctx_with_stats/invoke
                query.cljc: 1085  datahike.query$_resolve_clause/invokeStatic
                query.cljc: 1081  datahike.query$_resolve_clause/invoke
                query.cljc: 1083  datahike.query$_resolve_clause/invokeStatic
                query.cljc: 1081  datahike.query$_resolve_clause/invoke
                query.cljc: 1095  datahike.query$resolve_clause/invokeStatic
                query.cljc: 1088  datahike.query$resolve_clause/invoke
     PersistentVector.java:  343  clojure.lang.PersistentVector/reduce
                  core.clj: 6885  clojure.core/reduce
                  core.clj: 6868  clojure.core/reduce
                query.cljc: 1099  datahike.query$_q/invokeStatic
                query.cljc: 1097  datahike.query$_q/invoke
                query.cljc: 1234  datahike.query$raw_q/invokeStatic
                query.cljc: 1223  datahike.query$raw_q/invoke
                query.cljc:   80  datahike.query$q/invokeStatic
                query.cljc:   74  datahike.query$q/doInvoke
               RestFn.java:  439  clojure.lang.RestFn/invoke
     lookup_refs_test.cljc:  248  datahike.test.lookup_refs_test$fn__66025$fn__66058/invoke
     lookup_refs_test.cljc:  248  datahike.test.lookup_refs_test$fn__66025/invokeStatic
     lookup_refs_test.cljc:  186  datahike.test.lookup_refs_test$fn__66025/invoke
                  test.clj:  304  cider.nrepl.middleware.test/test-var/fn
                  test.clj:  303  cider.nrepl.middleware.test/test-var
                  test.clj:  293  cider.nrepl.middleware.test/test-var
                  test.clj:  336  cider.nrepl.middleware.test/test-vars/fn/fn/fn

What is the expected behaviour?

For the sake of consistency, I think this test should pass without throwing an exception when added to the test test-lookup-refs-query:

    (is (= (d/q '[:find ?e
                    :in $ ?e
                    :where [?e :friend 3]]
                  db "A")
             #{}))

The first step to make it pass would probably be to to remove the branch where :consts is accumulated in the function resolve-in.

How can the behaviour be reproduced?

See this commit on the branch jonasseglare/inconsistent-invalid-binding-demo for a unit test that demonstrates the issue. If you like I can open a PR from that branch so that we have this issue demonstrated in the source code.

Add this code to test-lookup-refs-query in lookup_refs_test.clj:

    ;; This test works, despite the fact that `?e` cannot be bound to `"A"`.
    (is (= (d/q '[:find ?e
                  :in $ [?e ...]
                  :where [?e :friend 3]]
                db ["A"])
           #{}))

   ;; This test currently throws an exception. Shouldn't it pass since the test above passes?
    (is (= (d/q '[:find ?e
                    :in $ ?e
                    :where [?e :friend 3]]
                  db "A")
             #{}))
whilo commented

That makes sense, I also think it is generally better to return nil than to throw errors for parametrized queries even though it is a bit concerning if you use the wrong type in the entity position, it is likely a bug from the user. So maybe throwing an error here is actually better for the user. Not sure. Either way it should be consistent.

As discussed this only occurs when someone makes an inconsistent query and therefore does not need a fix. It is nicer when the query engine just skips incorrect inputs than throwing exceptions.