innoq/spacy

Have a real database in the background


It would be nice if we could run spacy with a real database in the background instead of the in-memory database we currently use.

When starting the Crux component:

(let [opts {:crux.jdbc/connection-pool {:dialect {:crux/module 'crux.jdbc.sqlite/->dialect}
                                        :pool-opts {}
                                        :db-spec {:dbtype "sqlite"
                                                  :dbname "sample.db"}}
            :crux/tx-log {:crux/module 'crux.jdbc/->tx-log
                          :connection-pool :crux.jdbc/connection-pool}
            :crux/document-store {:crux/module 'crux.jdbc/->document-store
                                  :connection-pool :crux.jdbc/connection-pool}}
      node (crux/start-node opts)]
  ,,,
  (assoc component :node node))

This seems to work locally. We still need to:

  • decide how we handle configuration, such as :db-spec values,
  • pass that :db-spec into the Crux component, and
  • add a corresponding JDBC driver in project.clj.

I'm open to ideas about how best to handle configuration. I think we should read the values that are actually secret (like the DB password) from environment variables, but I'm not sure how to handle the rest of the config.
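One way to keep the secret out of the checked-in config is to merge it into the db-spec at startup. The following is only a minimal sketch: the function name `db-spec-from-env` and the variable name `SPACY_DB_PASSWORD` are made up for illustration, and it assumes the JDBC layer accepts a `:password` key in the db-spec.

```clojure
;; Sketch only: `db-spec-from-env` and SPACY_DB_PASSWORD are hypothetical names.
(defn db-spec-from-env
  "Returns `defaults` (the non-secret part of the db-spec) with :password
  added when the SPACY_DB_PASSWORD entry is present in `env`.
  `env` is a map of variable names to values; it defaults to the
  process environment."
  ([defaults] (db-spec-from-env defaults (System/getenv)))
  ([defaults env]
   (if-let [password (get env "SPACY_DB_PASSWORD")]
     (assoc defaults :password password)
     defaults)))
```

Taking `env` as an argument keeps the function testable without touching the real environment; the non-secret defaults could then live in a plain map or config file.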

Crux seems to support MySQL out of the box: https://github.com/juxt/crux/blob/master/crux-jdbc/src/crux/jdbc.clj#L56

One idea I have is that we could modify our Crux component to always use crux.jdbc, but pass the db-spec to the component on creation. Then we could create two system maps in spacy/system.clj, a dev-system and a prod-system. In the dev-system we could pass in a SQLite config, and for the prod-system we could retrieve the db-spec config from an environment variable.
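A rough sketch of that split, reusing the option map from the snippet earlier in this issue. Everything named here is an assumption for illustration: `jdbc-node-opts`, `dev-opts`, `prod-db-spec`, and `prod-opts` are hypothetical helpers, `SPACY_DB_SPEC` is a made-up variable holding the db-spec as an EDN string, and `crux.jdbc.mysql/->dialect` is assumed to be the MySQL counterpart of the SQLite dialect used above.

```clojure
(require '[clojure.edn :as edn])

(defn jdbc-node-opts
  "Builds Crux node options like the ones above, with the dialect
  symbol and the db-spec passed in instead of hard-coded."
  [dialect db-spec]
  {:crux.jdbc/connection-pool {:dialect {:crux/module dialect}
                               :pool-opts {}
                               :db-spec db-spec}
   :crux/tx-log {:crux/module 'crux.jdbc/->tx-log
                 :connection-pool :crux.jdbc/connection-pool}
   :crux/document-store {:crux/module 'crux.jdbc/->document-store
                         :connection-pool :crux.jdbc/connection-pool}})

;; Dev: local SQLite file, nothing secret.
(def dev-opts
  (jdbc-node-opts 'crux.jdbc.sqlite/->dialect
                  {:dbtype "sqlite" :dbname "sample.db"}))

(defn prod-db-spec
  "Reads the db-spec as EDN from the hypothetical SPACY_DB_SPEC entry
  of `env` (a map of variable names to values)."
  [env]
  (if-let [raw (get env "SPACY_DB_SPEC")]
    (edn/read-string raw)
    (throw (ex-info "SPACY_DB_SPEC is not set" {}))))

(defn prod-opts
  "Prod: db-spec comes from the environment; the MySQL dialect is an assumption."
  [env]
  (jdbc-node-opts 'crux.jdbc.mysql/->dialect (prod-db-spec env)))
```

The dev-system and prod-system would then differ only in which option map they hand to the Crux component, for example `dev-opts` versus `(prod-opts (System/getenv))`.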