INL/BlackLab

Add a way to ask BlackLab for a specific set of hits

To support custom hit-level annotations added by individual users (stored in a separate database), as well as parallel corpora, it would be very useful to be able to give BlackLab a list of hits (document IDs plus start and end positions) and have it return exactly those hits with context.

This is preferable to relying on annotations such as word IDs, because that would require searching for potentially long lists of IDs (using a query like [id="0001|0002|0003|0004|..."]), which is not always feasible.

Suggested API addition: instead of patt and filter, pass a hitlist parameter whose value is a JSON array:

[
  {
    "docPid": "0001",
    "start": 123,
    "end": 456
  },
  ...
]

BlackLab would return these hits in the order given (unless sort or even group is also specified).
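
To make the proposal concrete, here is a minimal client-side sketch of how such a request could be built. It is only an illustration: the hitlist parameter does not exist yet, and the server URL, the corpus name and the helper build_hitlist_url are all hypothetical.

```python
import json
from urllib.parse import urlencode


def build_hitlist_url(base_url, hits):
    """Build a hits URL carrying the *proposed* `hitlist` parameter.

    `hits` is an iterable of (docPid, start, end) tuples. The order of the
    tuples is preserved, since the proposal returns hits in request order.
    """
    payload = [
        {"docPid": doc_pid, "start": start, "end": end}
        for doc_pid, start, end in hits
    ]
    # Compact separators keep the URL-encoded JSON as short as possible.
    encoded = json.dumps(payload, separators=(",", ":"))
    return f"{base_url}/hits?{urlencode({'hitlist': encoded})}"


# Hypothetical server and corpus name, for illustration only.
url = build_hitlist_url(
    "http://localhost:8080/blacklab-server/my-corpus",
    [("0001", 123, 456), ("0002", 10, 12)],
)
print(url)
```

Since the whole point is to avoid very long queries, a long hit list may still exceed URL length limits when sent this way; sending the same parameter in a POST body would be one way around that, assuming the server accepts it.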