Reassess harvest file storage and usage
Closed this issue · 2 comments
GoogleCodeExporter commented
A number of issues here. Current state of affairs:
* Harvest files are stored on disk, edited by sysadmins there, but not used from there.
* Housekeeping periodically syncs this directory with storage.
* The storage OID algorithm uses a built-in system utility to combine hostname, username and file path. The OID therefore changes whenever any of these change, which causes migration issues.
* Individual objects have references to their harvest file OIDs stored against them.
* On first execution the Solr Indexer will cache an instantiated rules file in memory for performance.
* There are multiple instantiated Indexers throughout the system, each holding its own cache. This setup allows different versions of a cached rules file to claim the same OID over time.
* Once a rules file is cached, there is no point in syncing it from disk, as the synced copy will not be used until the system restarts.
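To illustrate the migration problem described above, here is a minimal sketch. It is not the project's actual code: the function names are hypothetical, and MD5 is only assumed as a stand-in for the "in-built system utility". The point is that an ID built from hostname, username and absolute path changes when the same file moves to another server, while an ID built from a path relative to the harvest directory does not.

```python
import hashlib


def host_bound_oid(hostname: str, username: str, file_path: str) -> str:
    """Hypothetical stand-in for the current scheme.

    The ID depends on where the file lives, not just on which file it is.
    MD5 is an assumption made for this sketch only.
    """
    key = f"{hostname}:{username}:{file_path}"
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def portable_oid(relative_path: str) -> str:
    """Hypothetical alternative: derive the ID only from the path
    relative to the harvest directory, normalised across platforms."""
    normalised = relative_path.replace("\\", "/").lstrip("/")
    return hashlib.md5(normalised.encode("utf-8")).hexdigest()


# Same file, same user, same path, different server: the ID changes.
before_move = host_bound_oid("server-a", "fascinator", "/opt/harvest/rules.py")
after_move = host_bound_oid("server-b", "fascinator", "/opt/harvest/rules.py")
oid_changed_on_migration = before_move != after_move

# The relative-path ID does not depend on the host or on path separators.
oid_stable = portable_oid("local/rules.py") == portable_oid("local\\rules.py")
```

Any object that stored `before_move` as its harvest file reference would no longer resolve after the migration, which is the situation the bullets above describe.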
Suggested changes:
* Harvest files should use a different OID method, if only to simplify migrations. Arguably they don't need to be kept in storage at all and could be accessed directly from disk.
* Indexer caches shouldn't be held unchanged for so long. Some periodic or automatic update mechanism should exist.
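One possible shape for the automatic cache update suggested above is to reload a rules file whenever its modification time on disk changes. The sketch below is an assumption for discussion, not the Indexer's real API: `RulesCache` and `read_text` are hypothetical names, and a real rules file would be compiled or instantiated rather than read as text.

```python
import os
from typing import Callable, Dict, Tuple


def read_text(path: str) -> str:
    """Placeholder loader; the real Indexer would instantiate the rules file."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()


class RulesCache:
    """Hypothetical cache that reloads an entry when the file's mtime changes."""

    def __init__(self, loader: Callable[[str], object]) -> None:
        self._loader = loader
        # path -> (mtime in nanoseconds at load time, loaded rules object)
        self._entries: Dict[str, Tuple[int, object]] = {}
        self.loads = 0  # number of times the loader was actually called

    def get(self, path: str) -> object:
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(path)
        if cached is None or cached[0] != mtime:
            # First use, or the file changed on disk since it was cached.
            cached = (mtime, self._loader(path))
            self._entries[path] = cached
            self.loads += 1
        return cached[1]
```

With this approach an edit made on disk would be picked up on the next `get` call instead of at the next restart. It costs one `stat` call per lookup, and multiple Indexer instances would still need to share one cache (or each check the file) to avoid holding different versions.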
Original issue reported on code.google.com by greg.pen...@gmail.com
on 6 Sep 2011 at 1:01
GoogleCodeExporter commented
Original comment by greg.pen...@gmail.com
on 23 May 2012 at 1:11
- Added labels: Type-Task
- Removed labels: Type-Defect
GoogleCodeExporter commented
Migrated to https://github.com/the-fascinator/the-fascinator/issues/7
Original comment by duncan.q...@gmail.com
on 14 Jan 2013 at 5:19
- Changed state: MovedToGitHub