additional URLs don't work
joyously opened this issue · 3 comments
This code gets the additional URLs into the $inclusion_candidates
variable, which is then never used. The code then loads the Exclusions into the $inclusions
variable (why?), and the loop duplicates any entries that are already in the CrawlLog. Those exclusions (with duplicates) are then added to the CrawlLog and CrawlQueue.
What is supposed to be happening here?
I changed lines 62 and 64 like this, and it seems to work:

```php
// $exclusions = Exclusions::getAll();
foreach ( $inclusion_candidates as $inclusion ) {
```
I'm not sure if something was supposed to be done with the exclusions.
Thanks @joyously!
Definitely looks like an issue with the inclusions, but perhaps commenting out that Exclusions line will have some adverse effects.
May need a bit more work to ensure that both exclusions and inclusions work when set (i.e., excluding a whole dir, then including a specific path from within that dir - IIRC, that's why the intention was to run exclusions before inclusions).
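A minimal sketch of that exclusions-before-inclusions ordering (the function and any helper names here are illustrative, not the plugin's actual API): filter the detected URLs through the exclusion rules first, then re-add explicitly included paths, so a specific inclusion can override a broader excluded directory.

```php
<?php
// Hypothetical sketch only: apply exclusion prefixes first, then re-add
// explicit inclusions so a specific path can override a broader
// excluded directory. Not the plugin's real code.

function buildCrawlList( array $detected, array $exclusions, array $inclusions ): array {
    // Drop any detected URL that matches an exclusion prefix.
    $kept = array_filter( $detected, function ( $url ) use ( $exclusions ) {
        foreach ( $exclusions as $excluded ) {
            if ( strpos( $url, $excluded ) === 0 ) {
                return false;
            }
        }
        return true;
    } );

    // Re-add explicit inclusions, deduplicating against what survived.
    foreach ( $inclusions as $inclusion ) {
        if ( ! in_array( $inclusion, $kept, true ) ) {
            $kept[] = $inclusion;
        }
    }

    return array_values( $kept );
}

// Example: exclude a whole dir, then include one path from within it.
$urls = buildCrawlList(
    [ '/about/', '/private/a/', '/private/b/' ],
    [ '/private/' ],
    [ '/private/b/' ]
);
// $urls is [ '/about/', '/private/b/' ]
```

Running inclusions after exclusions is what makes the override possible; reversing the order would let the directory-wide exclusion swallow the specific included path again.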
> perhaps commenting out that Exclusions line will have some adverse effects.
Yes, well, it used to say $inclusions = Exclusions::getAll();
and that just didn't seem correct. So, once I changed it to use the $exclusions
variable, I could see that it was only adding to the log and queue, which shouldn't happen for exclusions, so I commented it out.