/metalsmith-broken-link-checker

Metalsmith plugin to check for internal broken links

Primary LanguageJavaScriptMIT LicenseMIT

metalsmith-broken-link-checker

npm version Build Status Dependency Status

Metalsmith plugin to check for internal broken links

About

Small typos can often result in unexpected broken links. This plugin aims to catch them as early as possible.

It checks for relative and root-relative links which do not have a corresponding file in the Metalsmith pipeline. It (currently) ignores all absolute links.

Any broken links will cause an Error to be thrown (or a warning to be printed if options.warn is set).

By default, all href attributes of <a> tags and all src attributes of <img> tags are checked.

This plugin uses cheerio to find link and image tags and urijs to manipulate URLs.

I also wrote a blog post about what I learned developing this plugin.

Example

In your Metalsmith source dir, you have the following file (dir1/test-file.html):

<!DOCTYPE html>
<html>
<body>

  <a href="/a.html">(Root-relative link) Error if 'a.html' not in files</a>

  <a href="a.html">(Relative link) Error if 'dir1/a.html' not in files</a>
  <a href="./a.html">(Relative link) Error if 'dir1/a.html' not in files</a>
  <a href="../a.html">(Relative link) Error if 'a.html' not in files</a>

  <a href="/">(Root-relative link to dir) Error if 'index.html' not in files</a>
  <a href="dir2/">(Relative link to dir) Error if 'dir1/dir2/index.html' not in files</a>

  <a href="#fragment">(Hash fragment link) Always valid</a>
  <a href="/dir2/#fragment">(Hash fragment link) Error if 'dir2/index.html' not in files</a>

  <a>Missing href attribute, always broken</a>

  <img src="testimg.jpg" alt="(Relative link) Error if 'dir1/testimg.jpg' not in files">
  <img src="/testimg.jpg" alt="(Root-relative link) Error if 'testimg.jpg' not in files">

</body>
</html>

Note that links to directories are allowed if they have a trailing slash (the index.html file will be looked for). However links to directories without a trailing slash are not allowed unless the allowRedirects option is set.

Installation

$ npm install --save metalsmith-broken-link-checker

CLI Usage

In metalsmith.json:

{
  "source": "src",
  "destination": "build",
  "plugins": {
    "metalsmith-broken-link-checker": true
  }
}

JavaScript Usage

var Metalsmith = require('metalsmith')
var blc = require('metalsmith-broken-link-checker')

Metalsmith(__dirname)

  // Build your full site here...

  .use(blc(options))

  .build()

Options

allowRegex (optional, default: null )

  • Optional regex gets matched against every found URL
  • Use it if you want to allow some specific URLs which would otherwise get recognised as broken

allowRedirects (optional, default: false )

  • If false then links to directories will only be allowed if the link ends with a trailing slash (e.g. dir1/)
  • If true then links to directories will be allowed with or without a trailing slash (e.g. dir1/ or dir1)

allowAnchors (optional, default: true )

  • An anchor is an <a> tag with a name attribute but no href attribute (used for jumping to elements on a page with hash fragments)
  • For example, with allowAnchors set to true, the following would be allowed: <a name="anchor">Anchor text</a>

checkImages (optional, default: true )

  • Specifies whether or not to check src attributes of <img> tags

checkLinks (optional, default: true )

  • Specifies whether or not to check href attributes of <a> tags

checkAnchors (optional, default: false )

  • Specifies whether or not to check the validity of hash fragments in links
  • For example file.html#someid could link to a valid file but the file content could be missing the someid id or name on an element

warn (optional, default: false )

  • If false then throw an Error when encountering the first broken link
  • If true then print warnings to stderr for every broken link

baseURL (optional, default: null )

  • Should start with a slash, e.g. /base
  • Meant for sites which will be hosted within a subdirectory of another site
  • For example, if the output of your Metalsmith build will be hosted at http://example.com/base/ then links to /base/dir/file.html will be valid if dir/file.html exists in the metalsmith pipeline

History

  • 1.0.2 Handle spaces in filenames correctly
  • 1.0.1 Add filenames to warnings
  • 1.0.0
    • CoffeeScript -> JavaScript rewrite
    • Requires Node v6+ to run (breaking change)
    • Add anchor target checking
  • 0.1.11 Allow anchors with id attributes
  • 0.1.10 Add baseURL option
  • 0.1.9 Add CI badges to README
  • 0.1.8 Add allowRedirects option
  • 0.1.7 Change URIjs to urijs
  • 0.1.6 Fix forward slash Windows path bug
  • 0.1.5 Add allowAnchors option
  • 0.1.4 Normalize file paths for Windows
  • 0.1.3 Fail in the correct way for missing href attribute
  • 0.1.2 Updated README
  • 0.1.1 Updated README
  • 0.1.0 First release