bcgov/TheOrgBook

Define a strategy for keeping the Solr search indexes and autocomplete results up to date.


For performance reasons we have indexing on startup and real-time indexing disabled, which affects the availability of search results and the completeness of the autocomplete results.

The two appear to need to be updated together: the indexes first, then the autocomplete.

Define a strategy we can use moving forward. Perhaps a cron job? See von-bc-registries-agent as a reference; it uses go-crond, which is OpenShift-friendly.

Assigning to Wade. Andrew has made some improvements in the area, but the overall strategy is not defined or implemented.

I committed some updates to this today:

  • Multiple Solr updates during a transaction are batched and duplicates removed (greatly reducing the number of updates required during credential processing)
  • Solr index updates are further queued by a worker and committed every 5 seconds
  • The suggester is rebuilt every 5 minutes or every 1000 index updates, as needed, to provide up-to-date autocomplete results (a rough sketch of this approach follows the list)
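
For reference, here is a minimal sketch of what that queued-commit worker could look like, assuming a pysolr connection and a simple in-process queue. The Solr URL, core name, handler path, and thresholds are illustrative placeholders, not the actual tob-api code:

```python
# Hypothetical sketch of the batching/commit loop described above.
# SOLR_URL, SUGGEST_HANDLER, and the thresholds are assumptions for illustration.
import queue
import threading
import time

import pysolr
import requests

SOLR_URL = "http://solr:8983/solr/the_org_book"   # assumed core name
SUGGEST_HANDLER = SOLR_URL + "/suggest"            # assumed suggester handler
COMMIT_INTERVAL = 5          # seconds between index commits
SUGGEST_INTERVAL = 300       # rebuild the suggester at most every 5 minutes...
SUGGEST_THRESHOLD = 1000     # ...or after this many index updates

update_queue = queue.Queue()  # credential processing puts Solr docs here

def index_worker(solr: pysolr.Solr) -> None:
    updates_since_rebuild = 0
    last_rebuild = time.monotonic()
    while True:
        time.sleep(COMMIT_INTERVAL)
        # Drain the queue, de-duplicating by document id so a credential touched
        # several times in one transaction is only indexed once.
        batch = {}
        while not update_queue.empty():
            doc = update_queue.get_nowait()
            batch[doc["id"]] = doc
        if batch:
            solr.add(list(batch.values()), commit=True)
            updates_since_rebuild += len(batch)
        # Rebuild the suggester periodically so autocomplete stays current.
        now = time.monotonic()
        if updates_since_rebuild >= SUGGEST_THRESHOLD or (
            updates_since_rebuild and now - last_rebuild >= SUGGEST_INTERVAL
        ):
            requests.get(SUGGEST_HANDLER, params={"suggest.build": "true"})
            updates_since_rebuild = 0
            last_rebuild = now

solr = pysolr.Solr(SOLR_URL)
threading.Thread(target=index_worker, args=(solr,), daemon=True).start()
```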

Real-time indexing has been enabled in all environments. The latest updates to real-time indexing have been tested (with initial data load volumes) and the impact on performance has been greatly reduced.

That covers the search index side of the equation. I've talked with @CyWolf and @nrempel about the suggester implementation; we're still looking into it.

Is there a way to update the TOB docker-compose file to enable real-time Solr updates? It's not working now, so when we add a permitify company and credential to TOB, it still shows 0 and 0 on the search screen. I'd love to have those numbers updated.

@WadeBarnes @nrempel - any ideas?

@swcurran, it appears it should already be on: ENABLE_REALTIME_INDEXING=1. You should see a log line at the very beginning of the tob-api startup saying Enabling realtime indexing .... That said, it's not indexing that updates those numbers; that's a Postgres thing. The UI is getting cached results from Postgres, and there may not have been enough changes to trigger an ANALYZE. @CyWolf may have some ideas around this.
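
If the stale counts really are down to statistics not having been refreshed, one quick way to check is to run ANALYZE against the TOB database by hand and see whether the numbers change. A minimal sketch, assuming direct access to the Postgres container with psycopg2; the connection details below are placeholders, not the project's actual settings:

```python
# Hypothetical check: force a statistics refresh so cached row counts in Postgres
# are brought up to date. Host, database name, and credentials are assumptions.
import psycopg2

conn = psycopg2.connect(
    host="tob-db",            # assumed docker-compose service name
    dbname="THE_ORG_BOOK",    # assumed database name
    user="postgres",
    password="postgres",
)
with conn, conn.cursor() as cur:
    cur.execute("ANALYZE;")   # refresh planner statistics for all tables
```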

@swcurran, let me try updating the docker processes to do what I did here: #565

@swcurran, try the docker changes here: #574

I believe @CyWolf is working on the autocomplete portion of this.