Configurable naming of database columns

Question

Opened this issue 10 months ago · 1 comment

Add an optional prefix such as pset_ (e.g. run(..., pset_prefix="pset_")) to all pset variables, so that the db clearly distinguishes three kinds of columns:

bookkeeping
  _pset_id
  _run_id
  _pset_hash
  ...

pset content
  pset_foo
  pset_bar
  ...

results added by worker() and/or eval scripts. Users can also add
prefixes here as they wish, e.g. `postproc_` or `eval_`.
  baz
  boing
  ...
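
To make the proposed split concrete, here is a minimal sketch in plain Python. It does not use the library itself: add_pset_prefix() and classify_column() are hypothetical helpers made up for illustration, and the example values are invented. It only assumes that bookkeeping columns start with an underscore and that pset columns carry the configured prefix.

```python
def add_pset_prefix(pset, prefix="pset_"):
    """Return a copy of pset with prefix prepended to every key (hypothetical helper)."""
    return {f"{prefix}{key}": value for key, value in pset.items()}


def classify_column(name, prefix="pset_"):
    """Sort a column name into one of the three groups from the proposal."""
    # Check the leading underscore first, so that _pset_id counts as
    # bookkeeping and not as pset content.
    if name.startswith("_"):
        return "bookkeeping"
    if name.startswith(prefix):
        return "pset"
    return "result"


# One database row as a flat dict; all values are made up.
row = {
    "_pset_id": "abc",
    "_run_id": "xyz",
    **add_pset_prefix({"foo": 1, "bar": 2}),
    "baz": 3.5,
}

groups = {name: classify_column(name) for name in row}
```

With this layout, selecting all pset columns of a row is a single startswith() check on the column name.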

Answer 1 · 2024-03-07T19:22:36.000Z

Another option is a minimal naming convention. Bookkeeping fields with a leading underscore, such as _pset_id, are added before calling a worker, which is what we do now. Users could then be encouraged to name all fields returned by worker with a trailing underscore, such as result_ (inspired by how sklearn names estimator attributes that are set by fit()). All fields that have neither a leading nor a trailing underscore are then considered pset params and are used in hash calculations, should we need to re-calculate the hash of a database row. The refresh tooling may still get an additional option to specify the included fields, as outlined in #15.
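
A rough sketch of how this convention could be applied to a database row, again in plain Python. split_fields() and pset_hash() are hypothetical names, and hashing the sorted JSON dump with SHA-1 is only a stand-in for illustration, not necessarily how the hash is actually calculated.

```python
import hashlib
import json


def split_fields(row):
    """Split a row into bookkeeping, pset and result fields by underscore convention."""
    bookkeeping, pset, results = {}, {}, {}
    for name, value in row.items():
        if name.startswith("_"):
            # Leading underscore: added before calling the worker.
            bookkeeping[name] = value
        elif name.endswith("_"):
            # Trailing underscore: returned by the worker or eval scripts.
            results[name] = value
        else:
            # Neither: a pset param that enters the hash.
            pset[name] = value
    return bookkeeping, pset, results


def pset_hash(pset):
    """Hash the pset fields only (stand-in hash: SHA-1 of the sorted JSON dump)."""
    payload = json.dumps(pset, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


# Example row with made-up values.
row = {"_pset_id": "abc", "foo": 1, "bar": 2, "result_": 3.5}
bookkeeping, pset, results = split_fields(row)
new_hash = pset_hash(pset)
```

Because only fields without a leading or trailing underscore are hashed, adding more result fields to a row later does not change its hash.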