micromark/micromark-extension-frontmatter

Ability to specify the parser via the matter

balupton opened this issue · 5 comments

Subject of the feature

I'd like the front matter to specify its own parser, rather than having the parser set via a configuration option.

Problem & Expected behavior

The examples below use three backticks as the fence, which suits markdown best because it keeps syntax highlighting working in editors and on GitHub.

Using the default parser, assumed to be yaml:

```
some: data
```

# a h1

Specifically stating yaml:

``` yaml
some: data
```

# a h1

Specifically stating json:

``` json
{ "some": "data" }
```

# a h1

This allows my documents to use the parser that is most applicable to their specific needs, rather than one parser for all.

Alternatives

Refer to discussion at mdx-js/mdx#1315

@balupton I'm not sure I follow, all your examples are code fences.
Would it make sense to use code fences directly rather than making frontmatter work as a code fence?

> @balupton I'm not sure I follow, all your examples are code fences.

Having the front matter as a markdown code fence in markdown documents is a personal preference: it gets the built-in syntax highlighting for free, without having to build an entire new ecosystem around something that markdown doesn't normally support.

https://github.com/balupton/website/blob/2e9d2c131d4019126452b4314d926a0811c87aa9/pages/blog/contributor-friendly.mdx

https://raw.githubusercontent.com/balupton/website/2e9d2c131d4019126452b4314d926a0811c87aa9/pages/blog/contributor-friendly.mdx

This is also why https://github.com/bevry/docmatter supports any fence made of a character repeated 3 or more times: the front matter can then use whatever fence is appropriate to the language:

https://github.com/balupton/website/blob/1362659ed339fb90bc16df128ada82c0faea126b/src/layouts/page.html.coffee#L1-L6

So front matter in a JavaScript file using docmatter can be:

/***
some: data
***/

alert('your js code')

However, whether the fence is three backticks (my preference for markdown, for the reason above) or three dashes (---, the most common for front matter) is irrelevant to this feature request, which is about specifying the front matter parser in the fence information. So:

using yaml by default

---
some: data
---

using yaml explicitly

--- yaml
some: data
---

using json explicitly

--- json
{ "some": "data" }
---

my end goal is merely to accomplish

https://raw.githubusercontent.com/balupton/website/2e9d2c131d4019126452b4314d926a0811c87aa9/pages/blog/contributor-friendly.mdx

with the standard mdx ecosystem, without having to use my own docmatter mdx loader which I do not have the resources to maintain
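The fence-information idea above could be sketched roughly as follows. This is only an illustration of the request, not how micromark-extension-frontmatter or docmatter work: the `splitMatter` function and the `parsers` map are hypothetical names, and only JSON is registered because a YAML parser would need an external library.

```javascript
// Hypothetical parser registry: only JSON is built in here. A YAML entry
// would need an external library and is left out on purpose.
const parsers = new Map([['json', (text) => JSON.parse(text)]])

// Split a document into front matter data and body, reading the parser name
// from the text that follows the opening `---` fence.
// Assumption: `yaml` is the default when the fence carries no information.
function splitMatter(source, defaultLang = 'yaml') {
  const pattern = /^---[ \t]*([A-Za-z0-9]*)[ \t]*\r?\n([\s\S]*?)\r?\n---[ \t]*(?:\r?\n|$)/
  const match = pattern.exec(source)

  // No front matter fence at the very start: the whole source is the body.
  if (!match) return {lang: null, data: null, body: source}

  const lang = (match[1] || defaultLang).toLowerCase()
  const parse = parsers.get(lang)
  if (!parse) {
    throw new Error('No parser registered for front matter language: ' + lang)
  }

  return {lang, data: parse(match[2]), body: source.slice(match[0].length)}
}

const result = splitMatter('--- json\n{ "some": "data" }\n---\n\n# a h1\n')
console.log(result.lang, result.data, JSON.stringify(result.body))
```

Registering another language would be a matter of adding an entry to the map, for example `parsers.set('yaml', someYamlParseFunction)`, where the parse function comes from whichever YAML library is in use.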

@balupton my point is for your actual use cases

https://github.com/balupton/website/blob/2e9d2c131d4019126452b4314d926a0811c87aa9/pages/blog/contributor-friendly.mdx
https://raw.githubusercontent.com/balupton/website/2e9d2c131d4019126452b4314d926a0811c87aa9/pages/blog/contributor-friendly.mdx

They are all code fences. You don't need frontmatter at all if that's the syntax you want to use; you can rely on the default code block parsing with a plugin like:

const visit = require("unist-util-visit-parents")
const is = require("unist-util-is")

function remarkPluginFenceAsFrontmatter() {
  return function firstCodeBlockAsMatter(tree) {
    // unist-util-visit-parents passes the list of ancestors as the second argument
    visit(tree, 'code', function handleMatter(codeBlockNode, ancestors) {
      const parentNode = ancestors[ancestors.length - 1]
      // only treat the first code block at the root level as a frontmatter
      if (is(parentNode, 'root')) {
        // we now know codeBlockNode can be treated as a frontmatter
        // codeBlockNode.lang will have language information
        // codeBlockNode.value is the inner content as a string
        return visit.EXIT
      }
    })
  }
}

Okay thank you. I'll look into it more when I'm ready to get back to development on that project. Thanks for all your help.

I guess this would be an alternative to https://github.com/remarkjs/remark-frontmatter/blob/9e3bcbd7b578857e225224dff35a47da745e23cd/index.js

With something like:

const remark = require("remark")
const visit = require("unist-util-visit-parents")
const is = require("unist-util-is")

function remarkPluginFenceAsFrontmatter() {
  const data = this.data()
  return function firstCodeBlockAsMatter(tree) {
    // unist-util-visit-parents passes the list of ancestors as the second argument
    visit(tree, 'code', function handleMatter(codeBlockNode, ancestors) {
      const parentNode = ancestors[ancestors.length - 1]
      // @todo make sure it is the first element in the document, not just the first code block
      if (is(parentNode, 'root')) {
        // lang is null when the fence has no language information
        const lang = (codeBlockNode.lang || '').toLowerCase()
        if (lang === 'json') {
          Object.assign(data, JSON.parse(codeBlockNode.value))
        } else {
          throw new Error(`unsupported lang for frontmatter: ${codeBlockNode.lang}`)
        }
        return visit.EXIT
      }
    })
  }
}
remark().use(remarkPluginFenceAsFrontmatter)

Something like that indeed!

  1. You don’t need to walk the whole tree. The code is the first child of the tree, so you can check for that directly:
    return function firstCodeBlockAsMatter(tree) {
      const head = tree.children[0]
      if (!head || head.type !== 'code') return
      ...
  2. Instead of throwing an error, maybe a warning? In any case, the second argument given to your transformer is file, which you can emit errors and warnings on. It is meant for stuff like this:
    file.message('Expected `"json"`, not `"' + head.lang + '"`, for frontmatter code', head)
  3. You can use the above messages to also signal when JSON is incorrect!
    let matter = {}
    try {
      matter = JSON.parse(head.value)
    } catch (error) {
      file.fail('Could not parse JSON: ' + error.message, head)
      // This also throws again, but attaches the error to the file.
    }
    // so this doesn’t run
  4. The data you’re using (data = this.data()) is meant for parsing and compiling across multiple files. The frontmatter doesn’t really relate to that; it is metadata about a single file, and vfile is made for that:
    Object.assign(file.data, JSON.parse(head.value))
    // OR
    file.data.matter = JSON.parse(head.value)
    (matter is what vfile-matter assigns to, and works with some plugins, such as rehype-meta)
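
Putting the four suggestions above together, the transformer might look roughly like this. It is a sketch under stated assumptions rather than a tested plugin: `remarkFenceAsFrontmatter` is an illustrative name, and `createStubFile` is a hypothetical stand-in for a real vfile so the example runs without any dependencies. With unified, the `file` argument would be supplied by the processor.

```javascript
// Transformer factory following the four suggestions: check only the first
// child, warn on an unexpected lang, fail on invalid JSON, store on file.data.
function remarkFenceAsFrontmatter() {
  return function firstCodeBlockAsMatter(tree, file) {
    const head = tree.children[0]
    if (!head || head.type !== 'code') return

    // head.lang is null when the fence has no language information
    const lang = (head.lang || '').toLowerCase()
    if (lang !== 'json') {
      file.message('Expected `"json"`, not `"' + head.lang + '"`, for frontmatter code', head)
      return
    }

    let matter = {}
    try {
      matter = JSON.parse(head.value)
    } catch (error) {
      // On a real vfile, fail() records the message and throws.
      file.fail('Could not parse JSON: ' + error.message, head)
    }

    file.data.matter = matter
  }
}

// Hypothetical stand-in for a vfile, only here so the sketch is runnable.
function createStubFile() {
  return {
    data: {},
    messages: [],
    message(reason) {
      this.messages.push(reason)
    },
    fail(reason) {
      this.messages.push(reason)
      throw new Error(reason)
    }
  }
}

const tree = {
  type: 'root',
  children: [
    {type: 'code', lang: 'json', value: '{ "some": "data" }'},
    {type: 'heading', depth: 1, children: [{type: 'text', value: 'a h1'}]}
  ]
}
const file = createStubFile()
remarkFenceAsFrontmatter()(tree, file)
console.log(file.data.matter)
```

In a real pipeline this would be registered with `.use(remarkFenceAsFrontmatter)` on the processor, and the stub would not be needed.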