commonmark/commonmark-spec

Proper html structure

4e576rt8uh9ij9okp opened this issue · 3 comments

Many JavaScript projects refer to this module as a specification for their projects.

Is there any reason why the H1 heading isn't inside an article element and the H2 heading isn't inside a section element?

There is no language in the HTML spec that says article and section are what make a document "proper", or that headings need to be nested in these elements.
Rather, article and section have their own semantics according to the HTML spec, and these do not overlap 100% with what h1 and h2 mean, so wrapping headings automatically would result in incorrect semantics.

My understanding is that the HTML output given in the spec was intended as a set of non-normative examples. The spec is supposed to be about the semantics of CommonMark, and any rendering consistent with those semantics is conforming. Since CommonMark has neither syntax nor semantics corresponding to structural elements such as HTML article or section, it takes no position on their use. A renderer could do as you suggest and still be conforming.
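To make that concrete, here is a minimal sketch of two renderers working from the same parsed document. The tuple-based AST and both function names are hypothetical and only for illustration; they are not the data model or API of any reference implementation.

```python
from html import escape

# Hypothetical minimal AST: (node_type, heading_level, text) tuples.
DOC = [
    ("heading", 1, "Title"),
    ("paragraph", 0, "Intro text."),
    ("heading", 2, "Details"),
    ("paragraph", 0, "More text."),
]

def render_flat(doc):
    """Render headings and paragraphs as siblings, as the spec's examples do."""
    out = []
    for kind, level, text in doc:
        if kind == "heading":
            out.append(f"<h{level}>{escape(text)}</h{level}>")
        else:
            out.append(f"<p>{escape(text)}</p>")
    return "\n".join(out)

def render_sectioned(doc):
    """Render the same AST, but let each heading open a <section> that
    closes at the next heading of equal or higher rank."""
    out = []
    open_levels = []  # stack of heading levels that have an open <section>
    for kind, level, text in doc:
        if kind == "heading":
            # Close sections that this heading does not belong inside.
            while open_levels and open_levels[-1] >= level:
                out.append("</section>")
                open_levels.pop()
            out.append("<section>")
            open_levels.append(level)
            out.append(f"<h{level}>{escape(text)}</h{level}>")
        else:
            out.append(f"<p>{escape(text)}</p>")
    # Close whatever is still open at the end of the document.
    out.extend("</section>" for _ in open_levels)
    return "\n".join(out)
```

Both functions consume an identical parse result; only the choice of HTML wrapper differs, which is the part the spec leaves open.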

Unfortunately, the HTML given in the spec is commonly (mis)interpreted as normative. That may be because this point is not made forcefully enough in the spec, or because tool creators want to claim "100% compatibility with the spec" for marketing reasons. The simplest way to prove such a claim is to produce output that is exactly the same as the spec examples (i.e. treating them as normative) and exactly the same as the CommonMark reference implementations (which use the examples for their own conformance test suites).

If you search the spec for the words "conform(ing)", "render" and "abstract syntax tree" you'll find evidence of this intent. A couple of examples:

Though this spec is concerned with parsing, not rendering, it is recommended that in rendering to HTML, only the plain string content of the image description be used. Note that in the above example, the alt attribute’s value is foo bar, not foo [bar](/url) or foo <a href="/url">bar</a>. Only the plain string content is rendered, without formatting.

this spec does not mandate any particular treatment of the info string.

(I.e., conforming parsers are free to render code blocks with syntax highlighting.)
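As a rough illustration of that freedom, the sketch below shows two ways a renderer might treat the info string of a fenced code block. Both function names are made up for this example. The `language-` class mirrors what the spec's own examples show, but as quoted above, nothing mandates it.

```python
from html import escape

def render_code_plain(info, code):
    """Ignore the info string entirely."""
    return f"<pre><code>{escape(code)}</code></pre>"

def render_code_with_class(info, code):
    """Use the first word of the info string as a language class."""
    words = info.split()
    attr = f' class="language-{escape(words[0])}"' if words else ""
    return f"<pre><code{attr}>{escape(code)}</code></pre>"
```

A third renderer could hand the code and the language name to a syntax highlighter instead; all three treat the same parsed code block, just differently.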

An alternative approach, "to use an abstract representation of the syntax tree instead of HTML" as mentioned in the above link, might have helped avoid this misunderstanding, and would have aided the development of conforming parsers without inadvertently seeming to mandate a particular rendering.
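A conformance check along those lines could look roughly like the sketch below. Everything in it is hypothetical: `toy_parse` handles only ATX headings and single-line paragraphs and is nowhere near a CommonMark parser, and the dictionary shape of the tree is an assumption, not a format the spec defines.

```python
import re

def toy_parse(source):
    """Toy parser for illustration only (NOT CommonMark)."""
    tree = []
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines just separate blocks here
        match = re.match(r"(#{1,6})\s+(.*)", line)
        if match:
            tree.append({
                "type": "heading",
                "level": len(match.group(1)),
                "text": match.group(2),
            })
        else:
            tree.append({"type": "paragraph", "text": line})
    return tree

# An example expressed as input plus an expected tree, not expected HTML.
EXAMPLE = {
    "markdown": "# Title\n\nSome text.\n",
    "tree": [
        {"type": "heading", "level": 1, "text": "Title"},
        {"type": "paragraph", "text": "Some text."},
    ],
}

def conforms(parse, example):
    """Compare parser output with the expected tree; no HTML is involved."""
    return parse(example["markdown"]) == example["tree"]
```

Because the comparison stops at the tree, a test like this says nothing about whether headings end up inside section or article elements.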

This is my take. @jgm, am I off base?

I did some more research, and there is no clear consensus on how the article and section elements should be used.
Some devs say that articles can be used for comments with footers, while others say that each article should have a header and footer. Elsewhere it says that an article should have sections inside it, but also that articles can be nested within articles and that sections should be used for the layout of the website.