Couldn't keep to the domain
mgifford opened this issue · 3 comments
When I added -a, but I should be able to do either:
-a, --additional
With
node --max-old-space-size=6000 --no-deprecation purple-a11y/cli.js -u https://www.whitehouse.gov -c 2 -s same-domain -p 50 -a none --blacklistedPatternsFilename ./pa-gTracker-exclude-medicare.csv -k "Random Example:random@example.com"
It ran fine, but I found sub-domains in the returned results.
Hi @mgifford,
-s same-domain will include scan results from any sub-domain of the parent domain. E.g. scanning with -s same-domain -u tom.example.tld will make it possible to scan jerry.example.tld and example.tld.
If you wish to scan only tom.example.tld, then specify -s same-hostname -u tom.example.tld.
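To make the difference between the two strategies concrete, here is a minimal sketch of how a crawler might decide whether a link is in scope. It is not taken from purple-a11y's source: `inScope` and `naiveDomain` are hypothetical names, and the "last two labels" rule for finding the parent domain is a simplifying assumption that breaks for suffixes such as co.uk (a real implementation would more likely consult the Public Suffix List).

```javascript
// Hypothetical helpers for illustration only; not purple-a11y's actual code.

// ASSUMPTION: the parent domain is the last two labels of the hostname.
// This is wrong for multi-part suffixes such as "co.uk".
function naiveDomain(hostname) {
  return hostname.split(".").slice(-2).join(".");
}

// Returns true if linkUrl may be crawled when the scan starts at startUrl.
function inScope(startUrl, linkUrl, strategy) {
  const start = new URL(startUrl).hostname;
  const link = new URL(linkUrl).hostname;

  if (strategy === "same-hostname") {
    // Only the exact host that was passed with -u.
    return link === start;
  }
  if (strategy === "same-domain") {
    // Any host that shares the parent domain, including sibling sub-domains.
    return naiveDomain(link) === naiveDomain(start);
  }
  throw new Error(`Unknown strategy: ${strategy}`);
}

const start = "https://tom.example.tld";
console.log(inScope(start, "https://jerry.example.tld/page", "same-domain"));   // true
console.log(inScope(start, "https://example.tld/", "same-domain"));             // true
console.log(inScope(start, "https://jerry.example.tld/page", "same-hostname")); // false
console.log(inScope(start, "https://tom.example.tld/about", "same-hostname"));  // true
```

Under these assumptions, same-domain admits sibling sub-domains and the bare parent domain, while same-hostname admits only the host given with -u.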
Marking this issue as closed. Let me know if you find otherwise. :)
I read this completely wrong:
-s, --strategy  Strategy to choose which links to crawl in
                a website scan. Defaults to "same-domain".
                [choices: "same-domain", "same-hostname"]
It's not useful to go into a discussion of domain vs hostname, but maybe it is possible to change the help text.
Maybe something like:
-s, --strategy  Crawl specific hostname or more general domain
                Defaults to "same-domain", which includes sub-domains.
                [choices: "same-domain", "same-hostname"]
Maybe it's just me though @younglim