URL escapes are not normalized for local paths
BYK opened this issue · 7 comments
BYK commented
When a path has URL-unsafe characters, the URL to it gets escaped in the document, but during link checking hyperlink
does not unescape it to match the actual path on disk, leading to errors like this:
Error: bad links:
_next/static/chunks/pages/article/%5Bslug%5D-f92160effe6eedb195dc.js
The file _next/static/chunks/pages/article/[slug]-f92160effe6eedb195dc.js
indeed exists on the disk.
untitaker commented
Hm yeah. I think we need to urldecode both the path and the url for normalization
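To make the idea concrete, here is a rough std-only sketch of that normalization. This is not the code that shipped in hyperlink; `hex_value`, `percent_decode` and `paths_match` are hypothetical names made up for illustration, and it assumes malformed escapes should simply be left untouched.

```rust
/// Hypothetical helper: value of a single ASCII hex digit, if it is one.
fn hex_value(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Hypothetical helper: decode `%XX` escapes; malformed escapes are kept verbatim.
fn percent_decode(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        // A valid escape needs two more bytes after the '%'.
        if bytes[i] == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex_value(bytes[i + 1]), hex_value(bytes[i + 2])) {
                out.push(hi * 16 + lo);
                i += 3;
                continue;
            }
        }
        out.push(bytes[i]);
        i += 1;
    }
    // Decoded bytes may not be valid UTF-8; replace invalid sequences.
    String::from_utf8_lossy(&out).into_owned()
}

/// Hypothetical check: compare a link target and an on-disk path after
/// decoding both sides, so `%5Bslug%5D` and `[slug]` count as the same file.
fn paths_match(href: &str, disk_path: &str) -> bool {
    percent_decode(href) == percent_decode(disk_path)
}
```

With something like this, `paths_match("article/%5Bslug%5D.js", "article/[slug].js")` would be true. Decoding both sides keeps the comparison symmetric, though a file whose name literally contains a sequence like `%5B` would then also match `[`, which may or may not be acceptable.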
untitaker commented
I can get it fixed over the weekend, needs a new dependency on url. May want to copy lychee's approach, which is a lot more sophisticated there iirc @mre
EDIT: I think we just need this: https://github.com/lycheeverse/lychee/blob/34f379319d09221094f9ce7a9761ef0e12d78c0e/lychee-lib/src/extract.rs#L168
untitaker commented
should be fixed in 0.1.18
mre commented
Looks like you found a more lightweight solution than url. Good work.
untitaker commented
Yeah ignore the ping, I remember that lychee had either a distinct Link or Href type and wasn't really sure if I overlooked something, but I think that should be fine for now.
BYK commented
This was super fast, thanks a lot @untitaker! I was hoping to get a patch out myself but I need to bump up my Rust skills first :)
untitaker commented
lmk if you hit any other issues and then which ones you intend to work on!
On Mon, Nov 15, 2021, at 10:39, Burak Yigit Kaya wrote:
This was super fast, thanks a lot @untitaker <https://github.com/untitaker>! I was hoping to get a patch out myself but I need to bump up my Rust skills first :)