Detect broken anchor links
Closed this issue · 10 comments
Feature
We can detect broken links now, but another thing that can break easily is anchor links. As we currently auto-generate ids based on heading text, rewriting a heading can break linking, and it's not easy to notice it.
We need a way to check anchor links at build time too, fail fast, and report errors.
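To illustrate the failure mode, here is a minimal sketch of how a heading-derived id changes when the heading is reworded. The `slugify` function is a simplified, hypothetical stand-in and not the exact algorithm Docusaurus uses; the heading texts are made up for the example.

```typescript
// Simplified, hypothetical slug function (an assumption for illustration,
// not the exact algorithm Docusaurus uses to generate heading ids).
function slugify(heading: string): string {
  return heading
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // drop punctuation
    .replace(/\s+/g, "-"); // whitespace runs become hyphens
}

const before = slugify("Installing the CLI"); // "installing-the-cli"
const after = slugify("Install the CLI"); // "install-the-cli"

// A link written against the old heading no longer matches the new id.
const link = "#installing-the-cli";
const stillValid = link === "#" + after;

console.log(before, after, stillValid); // installing-the-cli install-the-cli false
```

Nothing fails at build time here: the page still renders, and the link simply stops scrolling to the heading.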
That would be amazing! We were thinking about adding this to our documentation setup (I was initially looking into something like remark-validate-links).
If anyone else is desperately waiting on this feature, I was able to get broken link detection set up in CI for my Docusaurus website using remark-validate-links and a GitHub action.
- Here's my GitHub action YAML file (lines 34-35 are the relevant lines)
- Here's my lint.sh file (very tiny; you could also skip this step by launching an npm/yarn script in package.json from the GitHub action directly, but I wanted to keep it separate)
- Here's my remark config (line 16 is the relevant line; I also do a whole bunch of other checks)
Thanks @Zamiell that's useful to know how you made this work :)
We'll probably build something more "integrated" on top of the existing broken links checker (which only runs at build time), but having a remark linter could also be helpful to "pre-check" the links.
+1 for this feature.
Also slightly related (I can create a new ticket if you think that's appropriate):
If your file is called docs index.md and the link points to docs Index.md, it will work on OSX but fail in the Linux CI.
I've tried to implement a workaround by modifying this function, and it should not be very complicated. I think it would be great if Docusaurus warned about this situation!
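A minimal sketch of what such a warning could check, assuming the list of known doc files is already available. `findCaseMismatch` and the file paths are hypothetical names for illustration, not part of Docusaurus.

```typescript
// Hypothetical helper (an assumption, not a Docusaurus API): returns the real
// file path when a link target matches a known file only if letter case is
// ignored. Such links resolve on case-insensitive file systems but fail on Linux.
function findCaseMismatch(target: string, knownFiles: string[]): string | null {
  if (knownFiles.indexOf(target) !== -1) {
    return null; // exact match, safe everywhere
  }
  const lowered = target.toLowerCase();
  for (const file of knownFiles) {
    if (file.toLowerCase() === lowered) {
      return file; // same path, different case
    }
  }
  return null; // no file at all: an ordinary broken link, handled elsewhere
}

// Illustrative paths only.
const files = ["docs/index.md", "docs/intro.md"];
console.log(findCaseMismatch("docs/Index.md", files)); // docs/index.md
console.log(findCaseMismatch("docs/index.md", files)); // null
```

Returning the correctly cased path makes it possible to print a "did you mean" style warning instead of a bare failure.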
Thx for the nice work you do!
I would also like this feature.
In the meantime, IntelliJ tells me if an open markdown file has Unresolved file references. Maybe there is a way for me to add that to my CI.
For anyone who is interested, I've created a plugin for remark-validate-links to support custom heading IDs. Please refer to https://github.com/xiaogaozi/remark-validate-links-heading-id for more information.
Thanks for reporting @xiaogaozi
Note that as part of the MDX 2 migration (#8288), {#headingId} is not valid syntax anymore, and Titus suggests we use {/*#headingId*/} instead.
It should also be possible to escape ids: \{#headingId}
Make sure to adapt once we upgrade to MDX 2
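As a rough sketch of what "adapting" could look like, the function below accepts both the `{#headingId}` form and the `{/*#headingId*/}` comment form, and leaves an escaped `\{#headingId}` untouched. `parseHeadingId` is a hypothetical helper written for this example; it is not the `@docusaurus/utils` implementation, and the exact escaping rules are an assumption.

```typescript
// Hypothetical parser for illustration (not the Docusaurus implementation).
function parseHeadingId(heading: string): { text: string; id: string | null } {
  const patterns = [
    /^(.*?)\s*\{#([\w-]+)\}\s*$/, // Title {#my-id}
    /^(.*?)\s*\{\/\*\s*#([\w-]+)\s*\*\/\}\s*$/, // Title {/*#my-id*/}
  ];
  for (const pattern of patterns) {
    const match = pattern.exec(heading);
    // Assumption: a backslash right before the brace escapes it,
    // so the heading is kept as-is and no id is extracted.
    if (match && match[1].slice(-1) !== "\\") {
      return { text: match[1], id: match[2] };
    }
  }
  return { text: heading, id: null };
}

console.log(parseHeadingId("Title {#my-id}")); // { text: 'Title', id: 'my-id' }
console.log(parseHeadingId("Title {/*#my-id*/}")); // { text: 'Title', id: 'my-id' }
console.log(parseHeadingId("Title \\{#my-id}").id); // null
```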
@slorber Actually, I use markdownUtils.parseMarkdownHeadingId() to parse headings, so maybe I just need to upgrade the @docusaurus/utils package once Docusaurus upgrades to MDX 2?
Thanks to all the folks who contributed to this issue, I managed to implement the broken anchor links detection for both npm start and npm run build in the docs.knapsackpro.com repo.
Packages
npm install --save-dev concurrently
npm install --save-dev remark-cli
npm install --save-dev remark-validate-links
package.json
{
  "scripts": {
    "...": "...",
    "remark": "remark",
    "remark:once": "npm run remark -- --quiet --frail --use remark-validate-links docs/",
    "remark:watch": "npm run remark -- --quiet --frail --use remark-validate-links --watch docs/",
    "start": "concurrently \"npm run remark:watch\" \"npm run typecheck:watch\" \"docusaurus start\"",
    "build": "npm run remark:once && npm run typecheck && docusaurus build"
  }
}
Links
For Remark to do its job, you need to use relative file paths in markdown files, which is what Docusaurus recommends anyways (see File paths and URL paths and Markdown links).
Here's an example commit where we changed url paths to relative file paths.
As a user, I looked at the "how can I make sure the links in the docs are OK?" problem today. Some findings:
- I obviously expect that it's the documentation generator's job to check the links, but it looks like at this point I need to work around this.
- Using remark here feels like a fundamentally wrong approach: the documentation generator generates HTML, which is the ground truth. The right approach is to check the links in the HTML, which is guaranteed to be correct and, moreover, is polymorphic in the input markup.
- It turns out my local setup with remark actually did miss a couple of broken links, exactly because what remark thinks the resulting HTML will be differs from the actual HTML.
- Luckily, there's a nice no-nonsense tool for checking links in the resulting HTML: https://github.com/untitaker/hyperlink
$ npx @untitaker/hyperlink ./build --check-anchors
was exactly how I found some broken links that remark missed.
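To make the "check the generated HTML" idea concrete, here is a minimal sketch that works on built pages held in memory. It is not how hyperlink is implemented; `findBrokenAnchors`, the page names, and the regex-based extraction are assumptions for illustration, and a real checker would use an HTML parser and read files from the build directory.

```typescript
// Hypothetical checker over already-built HTML, keyed by page path.
// Assumption: only relative links between pages in the map are considered.
function findBrokenAnchors(pages: { [path: string]: string }): string[] {
  // Pass 1: collect every id="..." that exists on each page.
  const ids: { [path: string]: string[] } = {};
  for (const path of Object.keys(pages)) {
    ids[path] = [];
    const idPattern = /\sid="([^"]+)"/g;
    let m: RegExpExecArray | null;
    while ((m = idPattern.exec(pages[path])) !== null) {
      ids[path].push(m[1]);
    }
  }
  // Pass 2: check every href that carries a "#fragment".
  const broken: string[] = [];
  for (const path of Object.keys(pages)) {
    const hrefPattern = /\shref="([^"#]*)#([^"]+)"/g;
    let m: RegExpExecArray | null;
    while ((m = hrefPattern.exec(pages[path])) !== null) {
      const targetPage = m[1] === "" ? path : m[1]; // "#x" points at the same page
      const known = ids[targetPage];
      if (known === undefined || known.indexOf(m[2]) === -1) {
        broken.push(path + " -> " + targetPage + "#" + m[2]);
      }
    }
  }
  return broken;
}

// Illustrative pages only.
const site = {
  "index.html": '<h2 id="setup">Setup</h2><a href="guide.html#install">Install</a><a href="#setup">Setup</a>',
  "guide.html": '<h2 id="installation">Installation</h2><a href="index.html#setup">Back</a>',
};
console.log(findBrokenAnchors(site)); // [ 'index.html -> guide.html#install' ]
```

Because the check runs on the final output, it sees the ids the generator actually emitted, regardless of how the source markup was written.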