Deleted samples present in ElasticSearch
Closed this issue · 3 comments
JimBacon commented
Samples with ids such as 8029261, 11868076, 11966510 and 18588200 are all marked deleted.
They are not represented in the ElasticSearch occurrence index but they are present in the ElasticSearch sample index.
We need to:
- confirm records are removed from the sample index when deleted
- update the sample index to remove deleted records which should not be there.
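One way to carry out the first check is to ask ElasticSearch directly whether the deleted samples still have documents in the sample index. The sketch below only builds the body of an `ids` query. The helper name and the `doc_id_prefix` parameter are hypothetical, and the prefix has to match whatever document_id scheme the indexing pipeline actually uses.

```python
import json

# Sample ids reported in this issue as deleted but still indexed.
DELETED_SAMPLE_IDS = [8029261, 11868076, 11966510, 18588200]


def build_ids_query(sample_ids, doc_id_prefix=""):
    """Build an ElasticSearch `ids` query body for the given sample ids.

    doc_id_prefix is an assumption: it stands for whatever prefix the
    indexing pipeline puts in front of the raw sample id.
    """
    doc_ids = [f"{doc_id_prefix}{sample_id}" for sample_id in sample_ids]
    return {"query": {"ids": {"values": doc_ids}}}


if __name__ == "__main__":
    # POST this body to <sample index>/_search; any hits are documents
    # that should have been removed from the index.
    print(json.dumps(build_ids_query(DELETED_SAMPLE_IDS), indent=2))
```

An empty result for ids that are marked deleted would suggest the index is clean; any hits are candidates for removal.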
JimBacon commented
This same issue has raised its head again in BiologicalRecordsCentre/ABLE#546
I've tracked it down to the Logstash configuration.
In samples-http-indicia.conf, each new record is given a unique id, which is document_id => "iBRCSMP%{id}".
In samples-http-indicia-deletions.conf, we seek to delete records with document_id => "brc1|%{id}".
The two ids never match, so the deletions find nothing to remove.
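The effect of this mismatch can be illustrated with a small sketch. It imitates Logstash's `%{id}` field substitution with a hypothetical `render` helper (not a Logstash API) and shows that the ids targeted for deletion never coincide with the ids that were indexed.

```python
INDEX_TEMPLATE = "iBRCSMP%{id}"   # from samples-http-indicia.conf
DELETE_TEMPLATE = "brc1|%{id}"    # from samples-http-indicia-deletions.conf


def render(template, sample_id):
    """Stand-in for Logstash's %{id} substitution (illustration only)."""
    return template.replace("%{id}", str(sample_id))


def undeleted(sample_ids, index_template, delete_template):
    """Return document ids that were indexed but are never targeted by a delete."""
    indexed = {render(index_template, s) for s in sample_ids}
    targeted = {render(delete_template, s) for s in sample_ids}
    return sorted(indexed - targeted)


if __name__ == "__main__":
    ids = [8029261, 11868076]
    # With mismatched templates every indexed document survives.
    print(undeleted(ids, INDEX_TEMPLATE, DELETE_TEMPLATE))
    # With matching templates nothing is left behind.
    print(undeleted(ids, INDEX_TEMPLATE, INDEX_TEMPLATE))
```

Using the same template in both pipelines is what makes the delete requests address the documents that were actually written.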
I will:
- update the document_id in samples-http-indicia-deletions.conf to be the same as in samples-http-indicia.conf
- restart the deletion process so that it scans the entire index, by deleting the rest-autofeed-BRCSMPDEL record from the variables table.
Note this only affects the sample index, not the occurrence index, so most reports/downloads are not affected.
JimBacon commented
Note:
- I've decreased the period of the sample deletion request from 15 minutes to 5 so that the scan of the whole index completes more quickly.
- The Logstash service has to be restarted for config changes to apply.
JimBacon commented
Successfully completed and working as expected.