tcort/markdown-link-check

403 for github.com because it blocks requests without `--compressed` curl flag

dg-nvm opened this issue · 4 comments

λ docker run --rm -v <PATH>:/directory -w /directory ghcr.io/tcort/markdown-link-check:latest doc/TODO.md

FILE: doc/TODO.md
  [✖] https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners

  1 links checked.

  ERROR: 1 dead links found!
  [✖] https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners → Status: 403

Local curl:

λ C:\Users\dawid.goslawski\scoop\shims\curl.exe -f https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners -o nul
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (22) The requested URL returned error: 403

Curl with --compressed:

λ C:\Users\dawid.goslawski\scoop\shims\curl.exe --silent --compressed -f https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners -o nul && echo "OK"
"OK"

As a workaround, you can add the following to the config file:

  "httpHeaders": [
    {
      "urls": ["https://docs.github.com/"],
      "headers": {
        "Accept-Encoding": "zstd, br, gzip, deflate"
      }
    }
  ]

as seen here: https://github.com/openmrs/openmrs-esm-core/pull/412/files
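To make it clearer which links the workaround covers, here is a minimal Python sketch of per-URL header selection. It assumes the checker applies a rule when the link starts with one of the entries in `urls`; that matching behaviour is an assumption for illustration, not confirmed in this thread. `headers_for` and `HTTP_HEADERS` are hypothetical names.

```python
from typing import Dict, List

# The "httpHeaders" fragment from the workaround, as a Python structure.
HTTP_HEADERS: List[dict] = [
    {
        "urls": ["https://docs.github.com/"],
        "headers": {"Accept-Encoding": "zstd, br, gzip, deflate"},
    }
]


def headers_for(link: str, rules: List[dict]) -> Dict[str, str]:
    """Return the extra headers for `link`.

    Assumption: a rule applies when the link starts with any of its `urls`.
    The first matching rule wins; links that match no rule get no extras.
    """
    for rule in rules:
        if any(link.startswith(prefix) for prefix in rule["urls"]):
            return dict(rule["headers"])
    return {}
```

Under that assumption, a link on a host that is not listed in `urls` gets no extra header, so every affected domain would need its own entry.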

Would it be a good option to send the header "Accept-Encoding: zstd, br, gzip, deflate" by default? I assume only the response code is checked and the response body is not used.

I know this is a specific case (because GitHub requires that header), but I do not see any side effects for other cases.
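A rough Python sketch of what "send it by default" could look like, using only the standard library. This is not markdown-link-check's code; `DEFAULT_HEADERS`, `build_request` and `check_status` are hypothetical names, and whether a given server accepts the request may also depend on other headers (such as the User-Agent).

```python
import urllib.error
import urllib.request
from typing import Dict, Optional

# Sent with every request unless a per-URL header overrides it.
DEFAULT_HEADERS: Dict[str, str] = {"Accept-Encoding": "zstd, br, gzip, deflate"}


def build_request(
    url: str, extra_headers: Optional[Dict[str, str]] = None
) -> urllib.request.Request:
    """Merge the default headers with optional per-URL headers."""
    headers = dict(DEFAULT_HEADERS)
    headers.update(extra_headers or {})
    return urllib.request.Request(url, headers=headers)


def check_status(
    url: str, extra_headers: Optional[Dict[str, str]] = None, timeout: float = 10.0
) -> int:
    """Return the HTTP status code. The body is never read or decoded,
    so advertising encodings the client cannot decompress does no harm here."""
    try:
        request = build_request(url, extra_headers)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report their code.
        return err.code
```

Because only the status code is inspected, the compressed body never has to be decoded, which is the reason the default looks low-risk.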

The workaround provided by @brandones works on non-redirected links 👍
Links that go through a 301/302 redirect still get a 403 and end up reported as dead here 😅 (I had missed one domain in the config, please forgive me 😅)

Setting those accepted compression (encoding) headers as the default looks good to me and may be the solution 😄

gwarf commented