jonschlinkert/gray-matter

Inconsistent yaml file reading behaviour

atruskie opened this issue · 4 comments

Problem

Given two variants of a file, each read with matter.read(), whose only difference is the YAML document header (---), the data is loaded automatically for the first file, while the second file is treated as plain text.

Case A

---
# config-A.yml
author: "an author"
contact: "https://twitter.com/"
description: "Personal blog"
domain: "https://domain.com"
name: "Blog"
rootPath:
> matter.read('config-A.yml')
{ orig: '---\r\nauthor: "an author"\r\ncontact: "https://twitter.com/"\r\ndescription: "Personal blog"\r\ndomain: "https://domain.com"\r\nname: "Blog"\r\nrootPath:',
  data:
   { author: 'an author',
     contact: 'https://twitter.com/',
     description: 'Personal blog',
     domain: 'https://domain.com',
     name: 'Blog',
     rootPath: null },
  content: '',
  path: 'config-A.yml' }

Case B

# config-B.yml
author: "an author"
contact: "https://twitter.com/"
description: "Personal blog"
domain: "https://domain.com"
name: "Blog"
rootPath:
> matter.read('config-B.yml')
{ orig: 'author: "an author"\r\ncontact: "https://twitter.com/"\r\ndescription: "Personal blog"\r\ndomain: "https://domain.com"\r\nname: "Blog"\r\nrootPath:',
  data: {},
  content: 'author: "an author"\r\ncontact: "https://twitter.com/"\r\ndescription: "Personal blog"\r\ndomain: "https://domain.com"\r\nname: "Blog"\r\nrootPath:',
  path: 'config-B.yml' }

Expected behaviour:

Honestly, I was surprised to find that gray-matter loads data-only files at all. However, since that is a stated goal, I'd expect both config files to be loaded, because both are valid YAML documents (the document header is optional, if I'm reading the spec right).

Lastly, js-yaml has no problems parsing either file and returns identical objects.


Package version tested against: gray-matter@2.1.0

doowb commented

I think this is the correct behavior because gray-matter parses the front matter from a file, then converts it into a JSON data object. For this to work, there need to be delimiters at the beginning of the file indicating that front matter exists. Without the delimiters, the file is treated as a normal text file.
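
To make the delimiter rule concrete, here is a minimal sketch of how an opening delimiter might gate front-matter detection. It is not taken from gray-matter's source; hasFrontMatter and splitFrontMatter are hypothetical helper names, and the sketch assumes that a missing closing delimiter means the rest of the file is front matter, which is what the Case A output above suggests.

```javascript
// Minimal sketch, NOT gray-matter's actual implementation.
// hasFrontMatter and splitFrontMatter are hypothetical names used for illustration.
const DELIM = '---';

// Front matter is only recognised when the very first line is the delimiter.
function hasFrontMatter(text) {
  return text === DELIM || /^---\r?\n/.test(text);
}

// Split a string into a raw front-matter block and the remaining content.
function splitFrontMatter(text) {
  if (!hasFrontMatter(text)) {
    // No opening delimiter: the whole string is treated as plain content.
    return { matter: '', content: text };
  }
  // Drop the opening delimiter and the line break that follows it.
  const rest = text.slice(DELIM.length).replace(/^\r?\n/, '');
  // Look for a closing delimiter on a line of its own.
  const close = rest.search(/^---\s*$/m);
  if (close === -1) {
    // Assumption: with no closing delimiter, everything is front matter.
    return { matter: rest, content: '' };
  }
  return {
    matter: rest.slice(0, close),
    // Remove the closing delimiter line itself from the content.
    content: rest.slice(close).replace(/^---[^\S\r\n]*\r?\n?/, ''),
  };
}
```

Under these assumptions, a string shaped like Case A yields a non-empty matter block and empty content, while a string shaped like Case B comes back unchanged as content, which matches the two outputs reported in the issue.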

If you just want to read in and parse yaml files, try read-data or read-yaml. read-data is nice because it can read in yaml or json files depending on the extension.

Yeah, to second what @doowb is saying, you have a syntax error in your examples. The closing --- delimiter is missing.

@doowb - my problem here is that gray-matter processes a .yml file at all. I have some 'data'/configuration files for a simple static site that are sometimes parsed by gray-matter and sometimes not.

To be clear: these config files do not have frontmatter. They are simply YAML files.

Granted, I would not normally pass config files to gray-matter but that is a quirk of the metalsmith tool I haven't worked out yet.

However, since the gray-matter project explicitly states that it should:

Have no problem reading YAML files directly

(see README.md) then I can only presume reading a valid YAML file is a case that gray-matter would want to support.


@jonschlinkert - I do not agree that I have a syntax error in either example document. Please correct me if I'm wrong, but in a valid YAML document, both the document header (---) and the end of document footer (...) are optional. A YAML document is not closed by another --- delimiter, rather a --- delimiter starts a new document. I have validated both example files in YAML linters and parsed both with js-yaml - as far as I can tell they are valid files.


Given the closest thing I could find to a frontmatter standard was the jekyll page on the topic (http://jekyllrb.com/docs/frontmatter/), I'd suggest that either:

  • gray-matter should not parse YAML-only files
  • or gray-matter should require an end-of-frontmatter token (---) to successfully parse the metadata
  • EDIT: or that any valid YAML document should be able to be parsed

An interesting counterpoint that might be raised is that the end-of-front-matter token is the same as the YAML document token (---), and that because of this, a file with YAML front matter could validly be parsed as two separate documents. However, this scenario assumes that:

  • All front-matter is YAML (which it isn't)
  • And the body of the document is also valid YAML (which in markdown, most certainly won't be)

Closing since I think gray-matter still handles this correctly. IMHO, if you know you are parsing plain yaml files directly, then - as you mention - you can use js-yaml to do so. If, however, you want to use gray-matter to do so, you can - with the only requirement being that you need to tell gray-matter what you expect by adding at least an opening delimiter. Otherwise, it's impossible to know if you want to parse the string without doing a try-catch, in which case we would never be sure if an error should actually be thrown since you might have passed a yaml string or a plain text string.
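
One way a caller might apply this advice is to decide up front which parser a file should go to, instead of relying on a try-catch. The sketch below picks a strategy from the file extension, similar in spirit to how read-data is described earlier in the thread. chooseParser and the strategy labels it returns are hypothetical and are not part of gray-matter, js-yaml, or read-data.

```javascript
const path = require('path');

// Hypothetical helper: choose a parsing strategy from the file extension,
// so plain YAML config files never reach the front-matter parser.
function chooseParser(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  if (ext === '.yml' || ext === '.yaml') {
    return 'yaml'; // e.g. hand the file to a YAML parser such as js-yaml
  }
  if (ext === '.json') {
    return 'json'; // e.g. JSON.parse
  }
  // Everything else (markdown, HTML, ...) is treated as front matter plus content.
  return 'front-matter';
}
```

With this split, a file like config-B.yml would be routed to the YAML parser regardless of whether it starts with ---, and only content files would be passed to matter.read().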