danburzo/percollate

Error when handling a Table node when converting to markdown

dannyob opened this issue · 2 comments

Environment

  • Operating System: Linux 6.1.0-10-amd64 Debian 6.1.38-2 (2023-07-27) x86_64 GNU/Linux
  • node --version: v18.13.0
  • npm --version: 9.2.0
  • yarn --version, if using Yarn:
  • percollate --version: 4.0.2

Description

Hi! My percollate currently falls over on sites like Wikipedia when converting to Markdown:

: workboat  ~%; percollate md 'https://en.wikipedia.org/wiki/Danny_O%27Brien_(journalist)'                       
Fetching: https://en.wikipedia.org/wiki/Danny_O%27Brien_(journalist) ✓
Enhancing web page: https://en.wikipedia.org/wiki/Danny_O%27Brien_(journalist) ✓
file:///usr/local/lib/node_modules/percollate/node_modules/mdast-util-to-markdown/lib/index.js:113
  throw new Error('Cannot handle unknown node `' + node.type + '`')
        ^

Error: Cannot handle unknown node `table`
    at Object.unknown (file:///usr/local/lib/node_modules/percollate/node_modules/mdast-util-to-markdown/lib/index.js:113:9)
    at Object.one [as handle] (file:///usr/local/lib/node_modules/percollate/node_modules/zwitch/index.js:108:17)
    at containerFlow (file:///usr/local/lib/node_modules/percollate/node_modules/mdast-util-to-markdown/lib/util/container-flow.js:36:15)
    at Object.containerFlowBound [as containerFlow] (file:///usr/local/lib/node_modules/percollate/node_modules/mdast-util-to-markdown/lib/index.js:158:10)
    at Object.root (file:///usr/local/lib/node_modules/percollate/node_modules/mdast-util-to-markdown/lib/handle/root.js:22:13)
    at Object.one [as handle] (file:///usr/local/lib/node_modules/percollate/node_modules/zwitch/index.js:108:17)
    at toMarkdown (file:///usr/local/lib/node_modules/percollate/node_modules/mdast-util-to-markdown/lib/index.js:71:22)
    at bundleMd (file:///usr/local/lib/node_modules/percollate/index.js:647:13)
    at async generate (file:///usr/local/lib/node_modules/percollate/index.js:710:3)
    at async md (file:///usr/local/lib/node_modules/percollate/index.js:758:9)

Node.js v18.13.0
: workboat  ~%;

I think this is because the mdast Markdown conversion only supports tables with the help of another npm module; see syntax-tree/mdast-util-to-markdown#1.
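To illustrate the kind of failure in the stack trace, here is a minimal, self-contained sketch of type-based handler dispatch. It is a simplified model and not the actual code of mdast-util-to-markdown or zwitch: `createSerializer`, `tableExtension`, and the `handlers` shape are hypothetical names made up for this example. The idea is that a core serializer only knows a fixed set of node types, throws on anything else, and learns `table` only when an extension registers a handler for it.

```javascript
// Simplified, hypothetical model of handler dispatch by node type.
// NOT the real mdast-util-to-markdown implementation; all names are illustrative.
function createSerializer(extensions = []) {
  // Core handlers: node types the serializer knows without any extension.
  const handlers = {
    root: (node, one) => node.children.map((child) => one(child)).join('\n\n'),
    paragraph: (node, one) => node.children.map((child) => one(child)).join(''),
    text: (node) => node.value,
  };

  // Each extension contributes additional handlers, e.g. for `table`.
  for (const extension of extensions) {
    Object.assign(handlers, extension.handlers);
  }

  // Dispatch on `node.type`; unknown types raise the error seen in the trace.
  const one = (node) => {
    const handle = handlers[node.type];
    if (!handle) {
      throw new Error('Cannot handle unknown node `' + node.type + '`');
    }
    return handle(node, one);
  };
  return one;
}

// Hypothetical extension that adds table support (assumes at least one row).
const tableExtension = {
  handlers: {
    table: (node, one) => {
      const rows = node.children.map((row) => one(row));
      const cells = node.children[0].children.map(() => '---');
      const separator = '| ' + cells.join(' | ') + ' |';
      return [rows[0], separator, ...rows.slice(1)].join('\n');
    },
    tableRow: (node, one) =>
      '| ' + node.children.map((cell) => one(cell)).join(' | ') + ' |',
    tableCell: (node, one) => node.children.map((child) => one(child)).join(''),
  },
};

// Small made-up tree: one paragraph followed by a 2x2 table.
const cell = (value) => ({ type: 'tableCell', children: [{ type: 'text', value }] });
const tree = {
  type: 'root',
  children: [
    { type: 'paragraph', children: [{ type: 'text', value: 'Intro' }] },
    {
      type: 'table',
      children: [
        { type: 'tableRow', children: [cell('A'), cell('B')] },
        { type: 'tableRow', children: [cell('1'), cell('2')] },
      ],
    },
  ],
};

const serializeWithTables = createSerializer([tableExtension]);
console.log(serializeWithTables(tree));
```

Calling `createSerializer()` with no extensions on the same tree throws ``Cannot handle unknown node `table` ``. In this model that is what happens when the table extension is never registered with the serializer.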

Good catch, @dannyob! We were already supposed to be using mdast-util-gfm, but there was a small typo that prevented it from being used.

Fixed in percollate@4.0.3.

Looking at the provided test case, I can see potential problems with how the remark pipeline takes the HTML verbatim and stringifies it to Markdown in ways that might trip up some Markdown parsers, especially around <i><a></a></i>, so that might warrant further attention.

But the issue at hand should be solved.