remarkjs/remark

Ampersand in link query strings unexpectedly escaped with backslash

delucis opened this issue · 5 comments

Affected packages and versions

remark@14.0.3 — also tested with 13.0.0, 14.0.0, and 14.0.2 with the same results.

Link to runnable example

https://codesandbox.io/s/remark-ampersand-escaping-repro-6mm6xg

Steps to reproduce

  1. Create a Markdown string containing a link with a query string. For example:

    const md = '[label](https://github.com/search?type=code&q=remark)';

  2. Process the string with remark:

    const file = await remark().process(md);

  3. Inspect the result:

    console.log(String(file));
    // [label](https://github.com/search?type=code\&q=remark)

Expected behavior

The original link and query string should remain unchanged.

Actual behavior

The & in the link’s query string is escaped with a \, breaking the link. (Perhaps this is intentional for a reason I don’t understand, but it was unexpected to me.)

Runtime

Node v16, v18, v20

Package manager

npm 8, pnpm

OS

Linux, macOS

Build and bundle tools

Rollup, esbuild, Parcel, Vite — tried a bunch of different environments to confirm behaviour.

wooorm commented

Hey!

This is intentional. &, depending on what comes after it, could turn into a character reference. For example, when copy; comes after it, &copy; is read as the character reference for ©.

“breaking the link”

The link is not broken. Markdown parsers support escapes.


For more info, see for example syntax-tree/mdast-util-to-markdown#53.
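
To make the point above concrete, here is a rough sketch of how a Markdown reader treats a link destination. It is a simplified model, not remark's or micromark's actual code: `decodeDestination`, `NAMED_REFERENCES`, and the example.com URLs are hypothetical names introduced for illustration, and the reference table holds only two entries. It shows that an unescaped & followed by copy; is decoded, while the escaped form \& comes out as a plain &.

```javascript
// Tiny illustrative subset of named character references (assumption: a real
// parser knows the full HTML list; two entries are enough for this sketch).
const NAMED_REFERENCES = new Map([
  ['copy', '©'],
  ['amp', '&'],
]);

// Matches either a backslash followed by an ASCII punctuation character,
// or something shaped like a named character reference (&name;).
const ESCAPE_OR_REFERENCE = /\\([!-\/:-@\[-`{-~])|&([A-Za-z][A-Za-z0-9]*);/g;

// Hypothetical helper: turn the raw text of a link destination into the URL
// a reader would end up with. Escapes win over character references.
function decodeDestination(raw) {
  return raw.replace(ESCAPE_OR_REFERENCE, (match, escaped, name) => {
    if (escaped !== undefined) {
      return escaped; // "\&" stands for a literal "&"
    }
    // Known reference: decode it. Unknown name: leave the text untouched.
    return NAMED_REFERENCES.has(name) ? NAMED_REFERENCES.get(name) : match;
  });
}

// The escaped output from the report decodes back to the original URL.
console.log(decodeDestination('https://github.com/search?type=code\\&q=remark'));
// https://github.com/search?type=code&q=remark

// Unescaped, "&copy;" is decoded and the query string changes.
console.log(decodeDestination('https://example.com/?a=1&copy;right'));
// https://example.com/?a=1©right

// Escaped, the same text survives as written.
console.log(decodeDestination('https://example.com/?a=1\\&copy;right'));
// https://example.com/?a=1&copy;right
```

Under this model the serializer's backslash is harmless: it disappears again when the Markdown is read, so the rendered link keeps its original query string.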

delucis commented

Thanks for the response! My rehype pipeline is still ending up with the bad link in the rendered content, but I’ll dig a bit deeper to figure out that bit.

Nope, this also wasn’t true. Things are actually working as expected; we just misattributed the real cause of an error to that sneaky-looking \. Apologies for the disturbance and thanks again! 🙌

wooorm commented

No worries!

Could you share the actual problem with me? I'd appreciate hearing your root cause for future strugglers!

delucis commented

Mostly human error! 😅

We have a pipeline that pulls in some Markdown, processes it with remark and a bunch of plugins, and then stores the resulting Markdown for rendering later. We were struggling with migrating some links where the search parameter syntax had changed and noticed “Oh there’s this \ in some of them after processing.” This led us to misattribute that as the cause of some links breaking (because naively copying a link like that does break it) and to debug what was introducing the \ (assuming it to be one of our plugins, but later realising it was remark itself). You mentioned “Markdown parsers support escapes”, so that made me double-check and realise this was the wrong rabbit hole and the final <a> in our rendered content was correct 😁

Anyway — appreciate your fast response. Helped get us back on track 💜