Revisit the concept of tight/loose list.
crlf0710 opened this issue · 30 comments
This is an issue inherited from CommonMark: there's a concept of a tight list, where each <li> directly contains inlines (leaf block), vs. the "loose" list, where each <li> contains a paragraph that contains inlines (container block). In CommonMark, a loose list has items separated by blank lines, while a tight list does not. So this is a property of the list itself.
But this design doesn't actually serve its purpose well. The end product is determining whether each list item is a leaf block or a container block. When a blockquote or a verbatim block is within a list item, it has to be a container, even within a tight list. At this time the html output will be half container and half leaf, e.g.
1. item one
> xxx
2. item two
outputs
...
<li>item one
<blockquote>
<p>xxx</p>
</blockquote>
</li>
...
This mix of inline and block content is unnecessarily complex and not well supported in many output formats.
Also, this leads to the "last-minute loose" issue, where a blank line between the 99th and 100th list items can turn the whole list from tight to loose. Every renderer has to wait until the last list item arrives before it can determine the actual output of the very first list item.
I hope we can remove the concept of tight/loose lists and replace it with heuristics that determine the container/leaf-ness of each list item with good locality, simply allowing some items in the same list to be containers and others to be leaves, but never both at the same time.
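To make the locality argument concrete, here is a minimal Python sketch. The `Item` record and both functions are hypothetical names for illustration; `list_is_tight` is a simplified stand-in for the list-level rule (it ignores blank lines inside an item), and `item_is_container` is only one possible local heuristic, not a settled proposal.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    text: str
    blank_before: bool = False     # blank line between this item and the previous one
    has_block_child: bool = False  # item holds a blockquote, verbatim block, etc.


def list_is_tight(items: List[Item]) -> bool:
    # List-level rule (simplified): one blank line anywhere makes the whole list loose.
    return not any(item.blank_before for item in items)


def item_is_container(item: Item) -> bool:
    # Hypothetical local rule: decided from the item alone, never from later items.
    return item.has_block_child or item.blank_before


# 99 items with no blank lines, then a 100th item preceded by a blank line.
items = [Item(f"item {n}") for n in range(1, 100)]
tight_after_99 = list_is_tight(items)
items.append(Item("item 100", blank_before=True))
tight_after_100 = list_is_tight(items)
first_item_is_container = item_is_container(items[0])
```

With the list-level rule, the answer for the whole list flips when the 100th item arrives, so the first item cannot be emitted early. With the local rule, the classification of the first item never depends on anything after it.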
Tentative +1. I've tried to make tight/loose distinction work for my blog, but that was quite a bit of hassle for no real gain, so I render all lists the same. Basically, I am a confused user who walked away from the feature.
It does seem to me that delegating that to the writer's heuristics, plus a couple of classes to override,
{.loose}
- foo
- bar
- baz
would be better.
Oook, so after putting two and two together, it was only this second that I realized, Molière-style, that tight/loose lists come from Markdown. I've been using Markdown for I don't know how long, and it's the first time I've realized that this feature exists. In retrospect, this explains why my lists sometimes look off in Markdown (they are accidentally loose)!
I now feel strongly that we should remove it )
+1, I feel most Markdown users don't know about / understand this syntax feature, and thus it leads to surprising results
The distinction doesn't have anything to do with leaf/container nodes. Note here that the AST is exactly the same for these two cases, except for the tight attribute on the list.
- one
- two
doc
list list_style="-" tight="true"
list_item
para
str text="one"
list_item
para
str text="two"
references = {
}
footnotes = {
}
- one

- two
doc
list list_style="-" tight="false"
list_item
para
str text="one"
list_item
para
str text="two"
references = {
}
footnotes = {
}
The distinction relates only to how the list is formatted in the output format. (And an output format that doesn't support the distinction---some don't---could just ignore it.)
The way it's handled will vary from one output format to another. In HTML, it involves removing <p> tags. In LaTeX, it involves setting some length parameters.
Of course, it might be possible to handle it a different way in HTML -- leaving the <p> tags and adding a CSS class, then adjusting things with CSS. It would be fine for a renderer to do this instead of what the default HTML renderer does (which is similar to what Markdown.pl did).
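The two HTML strategies can be sketched side by side. This is an illustrative Python sketch, not the actual djot renderer: `render_list`, its `mode` parameter, and the class name `tight` are assumptions, and each item is reduced to a single paragraph of plain text.

```python
from html import escape
from typing import List


def render_list(items: List[str], tight: bool, mode: str = "strip") -> str:
    """Render single-paragraph list items as an HTML bullet list.

    mode="strip": drop the <p> tags when the list is tight.
    mode="class": always keep <p> and mark tight lists with a CSS class
                  (the class name "tight" is an assumption).
    """
    attrs = ' class="tight"' if (tight and mode == "class") else ""
    out = [f"<ul{attrs}>"]
    for text in items:
        body = escape(text)
        if not (tight and mode == "strip"):
            body = f"<p>{body}</p>"
        out.append(f"<li>{body}</li>")
    out.append("</ul>")
    return "\n".join(out)


stripped = render_list(["one", "two"], tight=True)
classed = render_list(["one", "two"], tight=True, mode="class")
loose = render_list(["one", "two"], tight=False)
```

Either way the AST is untouched; only the writer consults the tight attribute, and a format that cannot express the distinction can skip the check entirely.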
The main question is whether there should be a mechanism to indicate this distinction in the source document. If the answer is yes, then the source readability desideratum strongly favors the way we currently indicate it (by looking for blank lines) over something like an explicit class .loose. (Avoiding English-centric directives is another design desideratum.)
Of course, it might be possible to handle it a different way in HTML – leaving the <p> tags and adding a CSS class, then adjusting things with CSS. It would be fine for a renderer to do this instead of what the default HTML renderer does (which is similar to what Markdown.pl did).
My gut feeling is that relying on CSS to remove the (normal) effect of tags should not be the default; if one doesn’t want a tag to make a difference one shouldn’t include the tag at all. That is and must be the default assumption.
Inserting whitespace through CSS is another matter though. In CSS margins on nested elements, unlike padding, aren’t additive; rather the widest margin on any of the nested elements “wins”, so it would be “safe” to only wrap list item content in paragraphs when a list item actually contains multiple blocks and then use CSS like this:
p, ul.loose > li, ol.loose > li {
  margin-bottom: 1em;
}
The actual whitespace seen will be only one em even when the last child of a <li> is a <p>.
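On the renderer side, the "only wrap when there are multiple blocks" rule could look roughly like the Python sketch below. The function names are hypothetical, each item is modeled as a list of paragraph strings, and the `loose` class matches the CSS selector above.

```python
from html import escape
from typing import List


def render_item(paragraphs: List[str]) -> str:
    # A single paragraph stays bare; several paragraphs each get their own <p>.
    if len(paragraphs) == 1:
        return f"<li>{escape(paragraphs[0])}</li>"
    inner = "".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return f"<li>{inner}</li>"


def render_loose_list(items: List[List[str]]) -> str:
    # The spacing itself is left to the stylesheet via the "loose" class.
    lines = ['<ul class="loose">']
    lines.extend(render_item(paragraphs) for paragraphs in items)
    lines.append("</ul>")
    return "\n".join(lines)


html_out = render_loose_list([["short item"], ["first para", "second para"]])
```

Here the markup carries only structure (how many blocks an item has), and the looseness is expressed once, on the list element.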
I very much think that a feature shouldn’t be removed just because people can’t be bothered to read the manual. It can’t be too much to ask that users read the syntax description, where the difference between tight and loose lists is described clearly enough. That said, the loose-list feature is not without its problems, mainly that if any item in a list contains more than one block the whole list becomes loose, i.e. the content of every list item is wrapped in <p> tags. That is frequently more than I for one want, so for that reason it might be better to let users explicitly request a loose list through a class or some less subtle syntax than the presence of blank lines within the list (though I have no idea what that syntax might be), letting renderers pick up an attribute on the list element object and implement looseness however suits the output format.
It's been a while since this has been discussed, so I'm not entirely sure where things currently stand, but I'll throw a thought out there anyway.
Currently, the djot syntax allows for multiple lists without any whitespace between them. So
- One
- Two
- Three
* A
* B
* C
would become two separate lists in HTML.
Would it not be more consistent with the rest of the syntax to require a blank line between different lists? This could also let one of the bullet list indicators stand as a way of designating a loose item in a list. For example,
- Tight
+ Loose
- Tight
i. Tight (ordered)
+ Loose (ordered)
i. Tight (ordered)
The type of list should already be known (if I understand the AST correctly) and applied to the loose item. This doesn't work though if an entire ordered list should be loose. In that case, it might be better to have an ordered loose list item designated by i+. or something like that. For example,
i+. Loose (ordered)
i. Tight (ordered)
i+. Loose (ordered)
The lack of alignment in the items isn't nice though, either. I suppose that could be fixed by using a slightly different syntax for ordered lists, like so:
+i. Loose (ordered)
-i. Tight (ordered)
+i. Loose (ordered)
Just an idea I wanted to throw out there.
Edit: Consistency in my examples.
I think you could just copy your proposal here and put a link to this issue on the discussion topic.
proposed principles
In my own plain text syntax work I'm at least 90% settled on the following principles:
- No such thing as a compact list, only whether a list can or should be rendered compactly
- Whether a list can be rendered compactly is presentation form and style dependent
- a plain text format is itself a particular presentation form with its own needs, needs that should not impact renderings into other forms.
- Whether a list that can be rendered compactly should be rendered compactly is a stylistic choice
loose/compact as rendering specific decisions
It's easy for a renderer to determine whether list items in a particular presentation should or must be separated from each other for visual clarity. For example, a list item that contains multiple paragraphs should be separated by extra white space from adjacent list items to avoid misleading visual groupings -- but if the list items were rendered with alternating backgrounds then it might not be necessary. If the deciding factor is purely aesthetics, that too should happen in rendering specific decisions, e.g. CSS.
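As a rough illustration of how little a renderer needs for this decision, here is a Python sketch. `needs_separation` and its `alternating_backgrounds` flag are hypothetical names, and each list item is modeled simply as a list of its block types.

```python
from typing import List


def needs_separation(items: List[List[str]], alternating_backgrounds: bool = False) -> bool:
    """Decide whether list items should be visually separated in this rendering.

    items: one list of block-type names per list item, e.g. ["para", "para"].
    alternating_backgrounds: a presentation style that already delimits items,
    so extra vertical white space is not needed for clarity.
    """
    if alternating_backgrounds:
        return False
    # Any item with more than one block risks misleading visual grouping.
    return any(len(blocks) > 1 for blocks in items)


simple = [["para"], ["para"]]
mixed = [["para"], ["para", "para"]]
decisions = (
    needs_separation(simple),
    needs_separation(mixed),
    needs_separation(mixed, alternating_backgrounds=True),
)
```

Nothing in this decision looks at blank lines in the source; it depends only on the content structure and on the presentation style in use.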
It's also important to recognize that plain text and graphical renderings have different presentational needs with regard to blank lines or vertical white space. Even different plain text syntaxes, such as Setext and ATX headings, have different needs. In Markdown,
- item one
- item two
## a heading
another block of
text
- item three
results in a tight list, while
- item one
- item two
a heading
---------
another block of
text
- item three
results in a loose one, but they both should look the same in HTML. Whatever the plain text syntax white space rules, they shouldn't dictate anything about renderings in other forms.
CommonMark had its hands tied because of Markdown precedent. There is no reason for djot to be encumbered by this.
how this would work for djot, expressed in djot
This list happens to be compact in djot syntax:
* It can be rendered in HTML compactly (without
`p` tags) because:
* each item is a single chunk of text
* like in a table cell. There is no nesting
of blocks
* But a rendering can choose differently, either
in its output structure (e.g. `p` tags) or via
stylesheet.
* If a rendering decision *must* be made in the
plain text, it should be made with the standard
djot mechanism for this: *block attributes*.
See below.
This next list has blank lines only because djot
syntax requires it, not because a loose list was
desired:
* Any blank lines required by a plain text syntax
(as opposed to the text's intrinsic structure,
e.g. paragraphs) should have zero impact on any
renderings into other formats.
* I have a preceding blank line *not* because I
want to be loose, but for consistency with
djot's "Paragraphs can never be interrupted by
other block-level elements" rule.
* I have an internal blank line *not* because I
want to force a loose list, but because in djot
> Paragraphs can never be interrupted by
> other block-level elements
* A rendering, though, can make its own choices,
be it through the rendered structure or application
of stylesheet.
{.loose-list}
* If the rendering choice needs to be made in the
plain text, use a block attribute, which can
override the renderer's choice, whatever it is.
* A smart HTML renderer would render this list
compactly by default.
{.highlight}
* 🌶 Critically, the insertion of a block attribute
and its required preceding blank line has no
impact in and of itself.
* That would defeat the purpose!
* Neither the outer nor inner lists are impacted by
the block attribute's presence, only its content.
This proposal not only separates content, syntax and presentation concerns, it disentangles djot syntax from the loose/compact list issue. Overloading blank lines that way (djot block element delimiter and loose list indicator) is simply a recipe for conundrums, paradoxes and befuddlement.
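A small Python sketch of the override logic, under stated assumptions: `resolve_spacing` is a hypothetical helper, `loose-list` is the class used in the example above, and `compact-list` is an invented counterpart. Both class names are parameters, so nothing forces a renderer to hardcode them.

```python
from typing import Iterable


def resolve_spacing(
    renderer_default: str,
    classes: Iterable[str],
    loose_class: str = "loose-list",      # class name taken from the example above
    compact_class: str = "compact-list",  # hypothetical counterpart
) -> str:
    """Return "loose" or "compact" for a list.

    The renderer's own choice applies unless a block attribute class overrides it.
    Blank lines in the source play no part in the decision.
    """
    present = set(classes)
    if loose_class in present:
        return "loose"
    if compact_class in present:
        return "compact"
    return renderer_default


overridden = resolve_spacing("compact", ["loose-list"])
untouched = resolve_spacing("compact", ["highlight"])
```

An unrelated class such as `highlight` leaves the renderer's choice alone, which matches the requirement that the mere presence of a block attribute has no effect on spacing.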
I think I like the idea that this is for the renderer to decide. But having a way to override it also seems potentially important, and there we get into the issue of English words again. Maybe it's okay if it's just a generic instruction for the renderer, and something that wouldn't need to be used much?
If the override is via a block attribute and a classname, doesn't that avoid introducing English into djot? Or is the concern that djot's built-in renderers will have hardcoded English classname dependencies?
Or is the concern that djot's built-in renderers will have hardcoded English classname dependencies?
Yes.
Leaving the renderer to decide an arbitrary attribute label for tightness/looseness seems like a good idea. Another option may be to indicate a tight list by using a hard linebreak:
- Hi
- I'm a
- loose list
- Hi,\
- Loose list,
I'm not a\
- loose list
But all in all, I think the simplicity of using an attribute is ideal. I really haven't seen many Markdown users utilize the distinction. If it's really needed, I believe granular control of spacing is possible through in-list attributes or using multiple lists.