stevenvachon/broken-link-checker

Question: CLI Output broken links only?

JayHoltslander opened this issue · 3 comments

Currently running on my local machine with:

blc http://mymac.local:5757/blog -ro --exclude-external > output.txt

Is there a way I can modify the above command to only output broken links and ignore all "0 broken" links?
I checked --help to look for an option but didn't see a way to do this.

One way to do it is to use the programmatic API: install broken-link-checker as a dependency and write a short script that creates your own checker instance. The blc command on its own might not be enough, since it prints every link by default.

In my script I used the HtmlUrlChecker class and customized the 'link' event handler. You could do something like this, and tweak it further to check or display whatever you want:

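// Written against the event-emitter style API (.on(...)); older releases
// take a handlers object as the constructor's second argument instead.
const { HtmlUrlChecker } = require('broken-link-checker');

const options = { excludeExternalLinks: true }; // same effect as --exclude-external
const pageURL = 'http://mymac.local:5757/blog';
const customData = {}; // passed back to every handler; unused here

const htmlUrlChecker = new HtmlUrlChecker(options)
  .on('error', (error) => {})
  .on('html', (tree, robots, response, pageURL, customData) => {})
  .on('queue', () => {})
  .on('junk', (result, customData) => {})
  .on('link', (result, customData) => {
    // Print only the links that failed, with the reason (e.g. HTTP_404)
    if (result.broken) {
      console.log(result.brokenReason, result.url.resolved)
    }
  })
  .on('page', (error, pageURL, customData) => {})
  .on('end', () => {});

htmlUrlChecker.enqueue(pageURL, customData);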
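
If you would rather get one grouped report at the end than a line per broken link as it is found, you could collect the results and print them from the 'end' handler. This is only a rough sketch: BrokenLinkReport is a made-up helper, not part of broken-link-checker, and it assumes each result also carries base.resolved (the page the link was found on) and url.original, so check that against the version you have installed.

```javascript
// Hypothetical helper (not part of broken-link-checker): groups broken
// links by the page they were found on.
class BrokenLinkReport {
  constructor() {
    this.byPage = new Map(); // page URL -> array of "REASON url" strings
  }

  // Call this from the 'link' handler with each result object.
  // Assumes result.base.resolved and result.url.original exist.
  add(result) {
    if (!result.broken) return; // skip links that are fine
    const page = String(result.base.resolved);
    const target = String(result.url.resolved || result.url.original);
    if (!this.byPage.has(page)) this.byPage.set(page, []);
    this.byPage.get(page).push(`${result.brokenReason} ${target}`);
  }

  // One line per page, followed by its broken links indented by two spaces.
  toString() {
    const lines = [];
    for (const [page, links] of this.byPage) {
      lines.push(page);
      for (const link of links) lines.push(`  ${link}`);
    }
    return lines.join('\n');
  }
}

// Wiring it into the checker would look roughly like:
//   const report = new BrokenLinkReport();
//   checker.on('link', (result) => report.add(result))
//          .on('end', () => console.log(report.toString()));
```

Run the script with node and redirect it the same way as before (> output.txt); pages with no broken links never show up in the report.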

There is already an open issue for this, and it includes a few more solutions & workarounds: #133

There is also a pull request, #137, that came out of issue #133 and implements this. It seems to have been ignored for over two years; it doesn't have a single comment.