CottageLabs/OpenArticleGauge

Invalidating licenses - need to repeatedly try

It seems possible to invalidate sets of up to ~10,000 licenses, with the service reporting back that it has done them. However, repeating the search shows that some remain, and it seems necessary to run the invalidation repeatedly to get the number down to zero.

[18:04:10] Emanuil Tolev: I figured out the problem too, besides the 30k simultaneous connections to Redis
[18:04:19] Emanuil Tolev: it deletes half of the remaining unknowns
[18:04:40] Emanuil Tolev: should work ok for those from the web UI, but batching tens of thousands results in that
[18:05:04] Emanuil Tolev: at least it's probably nothing intractable, Richard will spot quickly I'm sure

It always deletes (roughly) half of the remaining ones. I ran it a lot with -a -u, so all unknowns.

The last deletes were: deleted 2000, deleted 1200, nothing to delete.

This is probably caused by the way the iterator works in the back-end. Because it is iterating over a changing result set, it is essentially impossible to be sure that it sees every record: the records move around while the cursor is trying to page through them.
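
To illustrate the suspected mechanism, here is a minimal simulation. It assumes the iterator pages by offset and re-runs the query for each page; the actual back-end code is not shown in this issue, so `invalidate_pass` and the in-memory `store` are hypothetical stand-ins rather than the real implementation.

```python
def invalidate_pass(store, page_size=100):
    """One pass of offset-based paging that deletes each page as it is read.

    `store` is a set of record ids standing in for the records that
    currently match the "unknown license" query (hypothetical model).
    """
    deleted = 0
    offset = 0
    while True:
        # The query is re-evaluated for every page, so records already
        # deleted have dropped out and the rest have shifted forward.
        matching = sorted(store)
        page = matching[offset:offset + page_size]
        if not page:
            break
        for record_id in page:
            store.discard(record_id)
        deleted += len(page)
        # Advancing the offset skips the records that shifted into the
        # positions just vacated by the deleted page.
        offset += page_size
    return deleted


if __name__ == "__main__":
    store = set(range(1000))
    while store:
        print("deleted", invalidate_pass(store), "remaining", len(store))
```

In this model each pass stops once the offset overtakes the shrinking result set, which leaves roughly half of the records untouched and matches the behaviour reported above.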

The iterator works perfectly well when new records are being added, but not when the records it is paging through change underneath it.

The fix for this is either to pre-load all the identifiers into an in-memory set before invalidating them, or to look into whether Elasticsearch's scroll search can fix the problem.
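
The pre-load option could look roughly like the sketch below. It assumes offset-based paging over a live query, modelled here by a plain Python set; `invalidate_preloaded` and `store` are hypothetical names for illustration, not functions from the OpenArticleGauge code base.

```python
def invalidate_preloaded(store, page_size=100):
    """Collect every matching id first, then delete from that snapshot.

    `store` is a set of record ids standing in for the records that
    match the query (hypothetical model of the back-end index).
    """
    # Phase 1: page through the result set without modifying it, so
    # the offsets stay stable and no record is skipped.
    ids = []
    offset = 0
    while True:
        page = sorted(store)[offset:offset + page_size]
        if not page:
            break
        ids.extend(page)
        offset += page_size

    # Phase 2: invalidate from the in-memory list. The live result set
    # shrinks, but nothing is paging over it any more.
    for record_id in ids:
        store.discard(record_id)
    return len(ids)


if __name__ == "__main__":
    store = set(range(1000))
    print("deleted", invalidate_preloaded(store), "remaining", len(store))
```

The trade-off is memory: all identifiers are held at once, which should be acceptable for tens of thousands of ids but may not scale indefinitely.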