paragonie/ciphersweet

[Documentation] Blind Index `type` Appears Unused

mcordingley opened this issue · 3 comments

I'm assuming that the type value from a blind index is useful somehow, as the docs tell me this is the case, but nowhere is it defined what the type is used for or how. From reading through the source, I see how the value is calculated but again see nothing that actually uses it.

Can you please update the documentation with how this value is used?

It's looking like the idea is for there to be a single filters table and for this value to be used to look up values from that table, e.g.

CREATE TABLE blind_indexes (
    type VARCHAR(255) NOT NULL,
    value VARCHAR(255) NOT NULL,
    foreign_id UNSIGNED INTEGER NOT NULL,
    INDEX (type, value)
    UNIQUE INDEX (type, foreign_id)
);

SELECT foreign_id
FROM blind_indexes
WHERE type = 'foo'
  AND value = 'bar'

If so, perhaps the documentation should go into further detail about this and guide users into possible implementations: on-record index values, per-table filters tables, and the unified filters table shown above.

I like the idea of sequestering these all into one table, but it presents a challenge that I would expect to be ignored by developers using this, i.e. how to garbage-collect values out of the table, as we can't really use foreign key constraints to pick them up automatically. The way that I see it, GC would need to cover removing indexes that have gone away (type), individual rows whose parent records have gone away (foreign_id), and individual rows whose indexed values have changed (value).

It appears the per-table filters tables implementation would suffer from the first and third types of orphans and that on-record filters would suffer from none.

The first two types of orphans could be cleaned up by a periodic job that calculates the current set of types and then removes all rows not within that set and then uses the table to type mappings to find all records that have a stale foreign_id. A unique index ought to prevent the third type from being possible by forcing the developer to upsert.

Edit: Pardon all of the updates. I figure it's better than spamming new messages as I refine things.

It's looking like the idea is for there to be a single filters table and for this value to be used to look up values from that table

The idea was to facilitate that design, if so desired. We strive to be vendor-agnostic with respect to database software.

Thanks. I will open other issues or PRs around making this a bit more user friendly.