jgm/commonmark-hs

Fenced div class parser is too strict

Opened this issue · 5 comments

srid commented

In TailwindCSS, we have CSS class names like w-1/2 which trip up the fenced div parser.

Example,

:::{.w-1/2}
Hello world
:::

A potential fix is to add / to the list here (any reason not to do that?)

<|> satisfyTok (\c -> hasType (Symbol '-') c || hasType (Symbol '_') c)

Tailwind classes can also contain a colon (e.g. md:w-auto), so we might as well add : too.
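To make the proposal concrete, here is a minimal standalone sketch of the relaxed character set, written against plain `Char` values rather than commonmark-hs tokens. The names `isRelaxedClassChar` and `isRelaxedClassName` are hypothetical, and the sketch assumes alphanumerics are already accepted by the alternatives elided before the `<|>` above; it is not the library's code.

```haskell
import Data.Char (isAlphaNum)

-- Hypothetical helper, not part of commonmark-hs.
-- Accepts alphanumerics (Unicode-aware, per Data.Char.isAlphaNum),
-- the already-allowed '-' and '_', plus the proposed '/' and ':'.
isRelaxedClassChar :: Char -> Bool
isRelaxedClassChar c = isAlphaNum c || c `elem` "-_/:"

-- A class name passes when it is non-empty and every character is allowed.
isRelaxedClassName :: String -> Bool
isRelaxedClassName s = not (null s) && all isRelaxedClassChar s
```

Loaded in GHCi, `isRelaxedClassName "w-1/2"` and `isRelaxedClassName "md:w-auto"` both evaluate to `True`, while a name containing a space or a dot is still rejected.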

@jgm
I just created a PR for it

jgm commented

Can you link to documentation proving that / is allowed?
https://www.w3.org/TR/CSS21/grammar.html#scanner
suggests it isn't (without escaping anyway). Note

class
  : '.' IDENT

ident		-?{nmstart}{nmchar}*
nmstart		[_a-z]|{nonascii}|{escape}
nmchar		[_a-z0-9-]|{nonascii}|{escape}

srid commented
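The grammar excerpt above can be turned into a small checker to see which Tailwind names it rejects. This is a simplified sketch under stated assumptions: it omits the `{escape}` alternative entirely, treats `{nonascii}` as any code point from U+00A0 upward, and accepts upper-case letters because the CSS 2.1 scanner is case-insensitive. The function names are hypothetical.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

-- nmstart: [_a-z] | {nonascii}   ({escape} is omitted in this sketch)
isNmStart :: Char -> Bool
isNmStart c = c == '_' || isAsciiLower c || isAsciiUpper c || c >= '\xA0'

-- nmchar: [_a-z0-9-] | {nonascii}   ({escape} is omitted in this sketch)
isNmChar :: Char -> Bool
isNmChar c = isNmStart c || isDigit c || c == '-'

-- ident: -?{nmstart}{nmchar}*
isCssIdent :: String -> Bool
isCssIdent ('-' : rest) = identBody rest
isCssIdent s = identBody s

-- One nmstart character followed by any number of nmchar characters.
identBody :: String -> Bool
identBody (c : cs) = isNmStart c && all isNmChar cs
identBody [] = False
```

With these definitions, `isCssIdent "grid-cols-2"` is `True`, while `isCssIdent "w-1/2"` and `isCssIdent "md:w-auto"` are both `False`, which matches the point that `/` and `:` are not valid in an unescaped identifier.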

@jgm While it is true that the CSS spec disallows these characters, why should the commonmark parser assume that it must always generate strictly valid CSS classes? There are legitimate use cases for accepting non-standard CSS classes that are in turn run through CSS preprocessors like Tailwind, which give these characters special semantics and 'compile' them to valid CSS classes in the final HTML.

I use commonmark-hs with the fenced_divs extension in Emanote to apply Tailwind-based styling to arbitrary blocks in Markdown. For example, see https://note.ema.srid.ca/demo/custom-style#advanced-styling, where multiple embedded notes (themselves divs) are arranged in a grid using Tailwind's classes. Their source Markdown contains:

:::{.grid .grid-cols-2 .grid-flow-row .gap-0 .p-3 .bg-gray-500}
![[examples]]

![[start]]

![[file-links]]
:::

The problem here is that I'm unable to use responsive variant classes like md:grid-cols-1 (to make the grid a 'one column' grid on mobile devices), because the commonmark parser rejects classes that fall outside the CSS spec.


As an aside, Tailwind is an interesting library to consider here because it enables "composition" of CSS styles via nothing but CSS classes, which lends itself quite well to the fenced_divs extension inasmuch as it allows providing a list of classes (composed by concatenation, which is how Tailwind composes).

jgm commented

Tailwind has the following example:

<img class="w-16 md:w-32 lg:w-48" src="...">

Is the idea that tailwind compiles this HTML down to HTML that contains only valid class names?

Note: you can already do this:

% commonmark --json -xall
::: {class="tail/wind"}
hi
:::
{"pandoc-api-version":[1,22],"meta":{},"blocks":[{"t":"Div","c":[["",["tail/wind"],[]],[{"t":"Para","c":[{"t":"Str","c":"hi"}]}]]}]}

srid commented

Is the idea that tailwind compiles this HTML down to HTML that contains only valid class names?

So, there is actually more than one compiler. The answer is "no" in case 1, and "yes" in case 2.

  1. Tailwind JIT compiler: the 'special' CSS classes remain in the final HTML, but Tailwind generates a .css file with the class definition escaped. For example, this contains the definition .sm\:top-6{top:1.5rem}. It looks like they are relying on the browser being more lenient when parsing CSS classes used in HTML tag attributes.
  2. WindiCSS compiler's compile mode (--compile), which squashes "w-16 md:w-32 lg:w-48" into a single randomly named class (which is a valid CSS class name, like windi-15wa4me, as you can see further down that link), by rewriting the HTML along with generating the final CSS file (with the randomly named class definitions).

(There is also a JS shim, which is what Emanote currently uses, but it will be moving towards model 2 above.)
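The escaping in case 1 (`sm:top-6` becoming `.sm\:top-6` in the stylesheet) can be illustrated with a short sketch. This is not Tailwind's implementation; it is a simplified assumption that backslash-escapes any ASCII character other than letters, digits, `-` and `_`. It does not handle a leading digit, whitespace, or control characters, which need different escape forms. The function names are hypothetical.

```haskell
import Data.Char (isAlphaNum, isAscii)

-- Backslash-escape ASCII characters that are not allowed unescaped in a
-- CSS identifier. Non-ASCII characters are passed through unchanged.
-- Simplification: leading digits and whitespace are not treated specially.
escapeClassChar :: Char -> String
escapeClassChar c
  | isAscii c && not (isAlphaNum c || c `elem` "-_") = ['\\', c]
  | otherwise = [c]

-- Escape a whole class name, character by character.
escapeClassName :: String -> String
escapeClassName = concatMap escapeClassChar

-- Build the class selector as it would appear in a stylesheet.
classSelector :: String -> String
classSelector name = '.' : escapeClassName name
```

Here `putStrLn (classSelector "sm:top-6")` prints `.sm\:top-6`, the same selector quoted in case 1, while the class attribute in the HTML keeps the unescaped `sm:top-6`.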

{class="tail/wind"}

Ah, I didn't know about this syntax. That actually would work in our case, though it would be nice to be able to do things like this (using #75):

:::w-1/2
Half width paragraph
:::

But given that there is an alternative (slightly more verbose) syntax that does work, this now becomes much less important, I guess.