Reindex documents into a new Solr on OJF
cdrini opened this issue · 8 comments
Subtask of #1067
- Create a Solr environment on server.openjournal.foundation
- Index latest openlibrary dump into OJF Solr
- [-] Replay missing edits from Infobase onto OJF Solr until up-to-date (partially done). Skipping: should be done once on ol-solr0
- Test OJF Solr by linking dev.openlibrary.org to OJF Solr
Can I work on this issue?
So @cdrini, can I collaborate with you?
Hey @viragumathe5! Unfortunately this task is already underway (adding the WIP label!). There's already a pretty long backlog of related changes enqueued (#1843, #2246), so I can't think of a way to add you to this task :(
What type of things are you interested in? I'm sure we can find a good issue for you to work on :)
No problem at all, I was just offering to collaborate if required.
I would like to contribute in any way: documentation, codebase, etc.
I'm unable to do design work, though :)
I feel lucky to work for the Internet Archive.
Thank You
Reindex complete; here are the numbers (using 2020-01-31 dump; and querying 02-14 solr for "before" values)
Type | # in postgres | # in old solr | # in new solr | psql diff | solr diff |
---|---|---|---|---|---|
Works | 18891263 | 16934104 | 18891032 | -231 | 1956928 |
Orphans | 3117594 | 2093485 | 3115125 | -2469 | 1021640 |
Authors | 7247819 | 6982935 | 7247631 | -188 | 264696 |
Subjects | 0 | 1514064 | 1514068 | 1514068 | 4 |
Reindex complete; here are the numbers (using 2020-02-29 dump; and querying 03-03 solr for "before" values)
Type | # in postgres | # in old solr | # in new solr | psql diff | solr diff |
---|---|---|---|---|---|
Works | 18895253 | 16937045 | 18895021 | -232 | 1957976 |
Orphans | 3116995 | 2093378 | 3114527 | -2468 | 1021149 |
Authors | 7248307 | 6983408 | 7248115 | -192 | 264707 |
Subjects | 0 | 1514064 | 1514068 | 1514068 | 4 |
-> 3.2M records will be made visible 🎉 Next step: #1067
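
For anyone who wants to double-check the arithmetic, here is a minimal sketch that recomputes the two diff columns from the 2020-02-29 counts. It assumes "psql diff" means new solr minus postgres and "solr diff" means new solr minus old solr, which is what the other rows suggest; the `counts` dict and the `diffs` helper are just illustrative names, not part of the Open Library codebase.

```python
# Counts copied from the 2020-02-29 table: (postgres, old solr, new solr).
counts = {
    "Works": (18895253, 16937045, 18895021),
    "Orphans": (3116995, 2093378, 3114527),
    "Authors": (7248307, 6983408, 7248115),
    "Subjects": (0, 1514064, 1514068),
}


def diffs(postgres, old_solr, new_solr):
    """Return (psql diff, solr diff), both measured from the new solr count.

    Assumption: psql diff = new solr - postgres, solr diff = new solr - old solr.
    """
    return new_solr - postgres, new_solr - old_solr


rows = {name: diffs(*c) for name, c in counts.items()}

# Records present in the new solr but missing from the old one.
newly_visible = sum(solr_diff for _, solr_diff in rows.values())

for name, (psql_diff, solr_diff) in rows.items():
    print(f"{name:<8} psql diff {psql_diff:>8}  solr diff {solr_diff:>8}")
print(f"Newly visible records: {newly_visible}")
```

Summing the solr diff column this way gives 3243836, which is where the ~3.2M figure comes from.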