allowedDomains and deniedDomains parameters do not work
liufuhu opened this issue · 2 comments
liufuhu commented
What is the current behavior?
I set it up like this:
crawler.queue({
  url: 'https://a.example.com',
  obeyRobotsTxt: false,
  maxDepth: 2,
  depthPriority: true,
  allowedDomains: [/example\.com$/],
});
However, the options argument passed to the _checkAllowedDomains function in lib/hccrawler.js does not include the allowedDomains setting.
If the current behavior is a bug, please provide the steps to reproduce
It is a bug; please check. The package version is 1.8.0.
BubuAnabelas commented
The code looks good, so I think it should work as expected.
The _checkAllowedDomains function passes the array to the checkDomainMatch function, which runs a regular JS regex test against the requested URL's hostname.
If you find out what's wrong, please post it here.
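For reference, the kind of check described above would look roughly like the sketch below. This is only a minimal illustration, not the library's exact code: it assumes each entry in the domains array is either a plain string (compared to the hostname) or a RegExp (tested against the hostname).

const { URL } = require('url');

// Minimal sketch of a domain match: string entries must equal the hostname,
// RegExp entries are tested against it.
function checkDomainMatch(domains, hostname) {
  if (!domains || domains.length === 0) return false;
  return domains.some(domain => (
    domain instanceof RegExp ? domain.test(hostname) : domain === hostname
  ));
}

// Example using the option from this issue: the regex matches the queued URL's hostname.
const hostname = new URL('https://a.example.com').hostname; // 'a.example.com'
console.log(checkDomainMatch([/example\.com$/], hostname)); // true

Under this logic the allowedDomains value from the original snippet should allow https://a.example.com, so logging what actually reaches the check on your side would help narrow down where the option gets lost.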