jgm/djot

Use `![alt](path)` syntax not only to include images, but also text files

AvtechScientific opened this issue · 21 comments

Use ![alt](path) syntax not only to include images, but also text files.

See: jgm/djot.js#85

"This commit enables the creation of structured documents of arbitrary complexity, contrary to the current state of affairs where only one-pagers are allowed. So now if one needs to write a book or a lengthy article all the content must be in one file. Working with big files is not convenient and may slow down/crash the editor besides being hard to read. File inclusion, implemented in jgm/djot.js#94, allows the author to subdivide the book/article into separate manageable chapters. This provides djot with the power similar to that of LaTeX and makes it positively distinct from all the Markdown-like tools."

And a ready to merge PR: jgm/djot.js#94

Sorry, but there's a discrepancy between this description and what the code does. I see some code related to footnotes, etc. So what does it mean? What are the implications for other Djot implementations?

Also, sorry, but this is bragging, and weird:

> This provides djot with the power similar to that of LaTeX and makes it positively distinct from all the Markdown-like tools.

In my own workflows, I use a higher level master document file (above Djot etc.), and I have made printed books from front cover to back cover with them, assembling bits of Djot content (and also Markdown or other, when needed). I don't see anything being solved "positively" here.

> In my own workflows, I use a higher level master document file (above Djot etc.), and I have made printed books from front cover to back cover with them, assembling bits of Djot content (and also Markdown or other, when needed). I don't see anything being solved "positively" here.

Using your own workflows can you handle this simple case of two included files:

File_1:
Is this **bold

File_2:
or not?**... hmm

Master_File:
![File_1](./File_1) ![File_2](./File_2)

If yes - how do you do it?

I certainly would forbid breaking semantic structures between different files. What is your example above even aiming to solve? Do you really expect someone to start a bold structure in one file... and end it in another, seriously?

Speaking of semantics, what becomes of the "alt" content in your ![alt](somefile.djot)?

> Speaking of semantics, what becomes of the "alt" content in your ![alt](somefile.djot)?

Just to stay consistent with the image case - alt will be displayed if somefile.djot is not found.

> I certainly would forbid breaking semantic structures between different files. What is your example above even aiming to solve? Do you really expect someone to start a bold structure in one file... and end it in another, seriously?

  1. Judging by the lack of a solution on your side I conclude that you have no solution.
  2. The above-mentioned example was intended for demonstration purposes only. If you can't think of a practical example for this - I'll help you. Inclusion, i.a., enables templates. Let's imagine a document that consists of 2 constant sections - header and footer, and a variable middle section (e.g. a letter to company employees where the variable middle section consists of employee names being programmatically injected). The header might open an inline element (bold, italic or whatever) that the footer will close. It's just one example. I think the real life can bring more.
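The two-file case and the "alt as fallback" behaviour can be made concrete with a minimal sketch. It assumes purely textual inclusion that runs before parsing, which is not necessarily what jgm/djot.js#94 does; `expand_includes` and the in-memory `files` mapping are hypothetical names, and the sketch makes no attempt to tell images from text files.

```python
import re

# Matches ![alt](path). A simplification: no nesting, escapes, or attributes.
INCLUDE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def expand_includes(text, files):
    """Replace each ![alt](path) with the content of `path`.

    `files` is a hypothetical path -> content mapping standing in for the
    file system. If the path is missing, the alt text is used instead.
    """
    def substitute(match):
        alt, path = match.group(1), match.group(2)
        return files.get(path, alt)

    return INCLUDE.sub(substitute, text)


files = {
    "./File_1": "Is this **bold",
    "./File_2": "or not?**... hmm",
}
master = "![File_1](./File_1) ![File_2](./File_2)"

expanded = expand_includes(master, files)
print(expanded)  # Is this **bold or not?**... hmm

# A missing file falls back to the alt text.
missing = expand_includes("![fallback text](./nope)", files)
print(missing)  # fallback text
```

Under this assumption the markup opened in one file is closed by the other only because the parser sees a single concatenated text. An implementation that parses each file separately and splices the resulting trees would behave differently.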

> Just to stay consistent with the image case - alt will be displayed if somefile.djot is not found.

How consistent? In pandoc-flavored markdown, what you call the "alt" text may be used as the figure legend if the image stands alone in a paragraph of its own (see the "implicit figure" option).
The Djot syntax is not very clear yet about this use case, and it has no general provision for captioned images and figures either; for reference, see notably discussions #28 and #87, among others. But whether it eventually goes the same way as pandoc (using the bracketed text as an implicit figure caption) or via a generalization of the (currently table-only) ^ legend markup, the issue would remain the same.

> Judging by the lack of a solution on your side I conclude that you have no solution.

Sympathetic. But I recognize I wouldn't have a solution for a non-issue edge case with no clear semantics defined ;)

You mention templating - I'm not even that sure it should be part of the document syntax... And real templating goes far beyond mere content injection.
(For the mere record, however, I already need and use some sort of custom templating logic too, see #238, in specialized Djot-based template files. I do think the real life can bring more, indeed.)

I think the association of this syntax ![alt](path) with images is so widespread that it is not worth the effort to make it different.

One option would be to use another symbol, such as #:

  1. #(table.csv): by default, use the file extension to detect the file type
  2. #[csv](table.csv): the file type can be overridden like this, ignoring the file extension
  3. #[=](table.csv) or #[=csv](table.csv): indicate in the AST that the file should be included as raw content; the file type could then be used to add some kind of syntax highlighting
  4. #(table.csv@12..34): include only a slice of the file
  5. #(../user/profile.djot#who-i-am): include a section of the file that can be addressed by a reference

The only issue is that this symbol, #, is used for tagging in certain systems, but tags generally don't use symbols like [, ], (, or ), and usually need to start with a valid unicode_id_start character.
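To make the five proposed forms concrete, here is a minimal parsing sketch. The regular expression, the `Include` record and `parse_include` are hypothetical, not part of any Djot implementation, and the meaning of the `@12..34` slice is deliberately left uninterpreted.

```python
import re
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional, Tuple

# Hypothetical grammar: #[=fmt](path@start..end#fragment), every part but
# the path being optional.
DIRECTIVE = re.compile(
    r"#(?:\[(?P<raw>=?)(?P<fmt>\w*)\])?"
    r"\((?P<path>[^)@#\s]+)"
    r"(?:@(?P<start>\d+)\.\.(?P<end>\d+))?"
    r"(?:#(?P<frag>[\w-]+))?\)"
)


@dataclass
class Include:
    path: str
    fmt: str                         # explicit format, else the file extension
    raw: bool                        # True for the "=" forms
    span: Optional[Tuple[int, int]]  # the @start..end slice, uninterpreted
    fragment: Optional[str]          # the #reference part


def parse_include(text):
    """Parse one directive, or return None if `text` is not one."""
    m = DIRECTIVE.fullmatch(text)
    if m is None:
        return None
    path = m.group("path")
    fmt = m.group("fmt") or PurePosixPath(path).suffix.lstrip(".")
    span = None
    if m.group("start") is not None:
        span = (int(m.group("start")), int(m.group("end")))
    return Include(path, fmt, m.group("raw") == "=", span, m.group("frag"))


print(parse_include("#[=csv](table.csv)"))
print(parse_include("#(../user/profile.djot#who-i-am)"))
```

A plain tag such as `#tag` does not match, because the directive requires `(` or `[` directly after the `#`.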

Your (4) goes far beyond the Djot input format markup and is perhaps best left to renderers' interpretation... If instead of a CSV you had (to stay with something similar) a spreadsheet (ODS/OOXML), the "Sheet" name might be needed, etc. All of this is highly dependent on the source format, so a specification would have to be very explicit. (What's a slice of a CSV actually? Lines, columns, both? Etc.)

@Omikhleia you are right on this; it was mostly related to another comment, #199 (comment). But this could also be solved using URIs, e.g. #(table.csv?start_row=12&end_row=34), which leaves the interpreter more freedom in how to handle it.
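As a rough sketch of how an interpreter might handle such a query string, assuming `start_row` and `end_row` are 1-based, inclusive, and count the header line as row 1 (none of which the thread specifies), and with `select_rows` as a hypothetical helper:

```python
import csv
import io
from urllib.parse import parse_qs, urlsplit


def select_rows(reference, csv_text):
    """Split `reference` into path and query, then slice the CSV rows.

    Assumption: start_row/end_row are 1-based and inclusive; a missing
    parameter means "from the first row" or "to the last row".
    """
    parts = urlsplit(reference)
    query = parse_qs(parts.query)
    rows = list(csv.reader(io.StringIO(csv_text)))
    start = int(query.get("start_row", ["1"])[0])
    end = int(query.get("end_row", [str(len(rows))])[0])
    return parts.path, rows[start - 1:end]


data = "a,b\n1,2\n3,4\n5,6\n"
path, rows = select_rows("table.csv?start_row=2&end_row=3", data)
print(path, rows)  # table.csv [['1', '2'], ['3', '4']]
```

The parameter names come from the example above; how they map onto rows, columns or sheets would still have to be defined per source format.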

> I think the association of this syntax ![alt](path) with images is so widespread that it is not worth the effort to make it different.

It would not be making it different, but would be generalizing it, with consistent transclusion semantics for any referenced media type:

  • [label](moon.jpg): link to image
  • ![alt](moon.jpg): embed image
  • [label](phases.csv): link to CSV
  • ![alt](phases.csv): embed CSV as table

@vassudanagunta I think the problem with that is: what would be the meaning of [alt]? In images and links it is used as a label, but in a CSV or djot file, etc.? Also, how do you assign a file type? Using just the file extension is not enough, because some file extensions clash, and in other cases there is no file extension at all, as in a URL like https://example.com/user/81a6bf136427d9e/raw/acde69adea3db6

Another option would be to use attributes to sort this out:

  • ![](https://example.com/user/81a6bf136427d9e/raw/acde69adea3db6){file_type=csv}

Another way perhaps, which remains compatible with the default/current syntax: !format[alt](url) where "format" is optional and "guessed" by the renderer (btw. the file extension is not the only way, one could do a file introspection, etc.). This would allow for !csv[alt](myfile.txt)

[alt] is alternate text, not a label, meant to be used instead of the referenced resource in a number of circumstances. This is current semantics.

The issue of unknown file types is a general problem not exclusive to markup languages (e.g. opening a file via your OS GUI, or HTTP GET). It should be solved the same way rather than introduce something new: use a file extension as best practice. Else check mime type. Else check magic byte. Else report error.
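The fallback chain described here (extension, then MIME type, then magic bytes, then error) could look roughly like the sketch below. The lookup tables are deliberately tiny and illustrative, and `detect_format`, its parameters and the returned format names are all hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Illustrative tables only; a real renderer would know many more types.
BY_EXTENSION = {".csv": "csv", ".png": "png", ".jpg": "jpeg", ".djot": "djot"}
BY_MIME_TYPE = {"text/csv": "csv", "image/png": "png", "image/jpeg": "jpeg"}
BY_MAGIC = [(b"\x89PNG\r\n\x1a\n", "png"), (b"\xff\xd8\xff", "jpeg")]


def detect_format(reference, content_type=None, head=b""):
    """Guess the format of a referenced resource.

    `content_type` is a MIME type as an HTTP response might report it;
    `head` is the first bytes of the resource. Both are optional.
    """
    # 1. File extension, taken from the path part of the reference.
    suffix = PurePosixPath(urlsplit(reference).path).suffix.lower()
    if suffix in BY_EXTENSION:
        return BY_EXTENSION[suffix]
    # 2. MIME type, ignoring parameters such as "; charset=utf-8".
    if content_type:
        mime = content_type.split(";")[0].strip().lower()
        if mime in BY_MIME_TYPE:
            return BY_MIME_TYPE[mime]
    # 3. Magic bytes at the start of the content.
    for signature, fmt in BY_MAGIC:
        if head.startswith(signature):
            return fmt
    # 4. Nothing matched: report an error.
    raise ValueError(f"cannot determine the type of {reference!r}")


url = "https://example.com/user/81a6bf136427d9e/raw/acde69adea3db6"
print(detect_format("phases.csv"))                 # csv
print(detect_format(url, content_type="text/csv")) # csv
```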

> ![](https://example.com/user/81a6bf136427d9e/raw/acde69adea3db6){file_type=csv}
>
> Another way perhaps, which remains compatible with the default/current syntax: !format[alt](url) where "format" is optional and "guessed" by the renderer (btw. the file extension is not the only way, one could do a file introspection, etc.). This would allow for !csv[alt](myfile.txt)

This !format[alt](uri) could actually work, and be implemented easily at the lexer level: the format would be a valid Unicode identifier, and the lexer would match !{unicode_id}[ as the token that starts a file transclusion; otherwise it would just emit a text token.

But there are 2 types of file transclusion:

  1. include the file and render it (interpreted); for example, a CSV file would end up rendered as a table
  2. include the file content without interpretation; this would be equivalent to writing a code block with the content of the file.
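A minimal sketch of that lexer rule, using Python's `str.isidentifier()` as a stand-in for "valid Unicode identifier" (it also accepts a leading underscore); `match_transclusion_start` is a hypothetical name, not part of any Djot lexer:

```python
import re

# "!" followed by optional word characters and an opening bracket.
TRANSCLUSION_START = re.compile(r"!(\w*)\[")


def match_transclusion_start(text, pos=0):
    """Return (format, end_position) if a start token begins at `pos`.

    Returns None otherwise, in which case a lexer would emit plain text.
    An empty format means the plain ![ form.
    """
    m = TRANSCLUSION_START.match(text, pos)
    if m is None:
        return None
    fmt = m.group(1)
    # \w also matches digits, so reject names such as "42".
    if fmt and not fmt.isidentifier():
        return None
    return fmt, m.end()


print(match_transclusion_start("!csv[alt](myfile.txt)"))  # ('csv', 5)
print(match_transclusion_start("![alt](moon.jpg)"))       # ('', 2)
print(match_transclusion_start("!42[alt](x)"))            # None
```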

So based on that and to maintain consistency, djot can potentially handle both cases and use a similar syntax to what is used currently to differentiate between a raw content and a code block

  1. !=[alt](uri), interpreted without a format
  2. ![alt](uri), include the file content without a format
  3. !=format[alt](uri), interpreted with a format
  4. !format[alt](uri), include the file content with a format, allowing syntax highlight

The problem is that this would change the current semantics of ![alt](uri), but an exception could be made for URIs that have an image file type, to maintain some backward compatibility.

> But there are 2 types of file transclusion (...)
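The four forms, together with the image exception, can be sketched as a small classifier. The regular expression, the `IMAGE_SUFFIXES` set and `classify` are hypothetical, and the sketch does not check that the format is a valid identifier.

```python
import re
from pathlib import PurePosixPath

# !(=)(format)[alt](uri), with "=" and format both optional.
EMBED = re.compile(r"!(=?)(\w*)\[([^\]]*)\]\(([^)\s]+)\)")
# Assumed set of "image file types" for the backward-compatibility exception.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".svg"}


def classify(markup):
    """Return (mode, format) for one embed, or None if it does not match.

    mode is "interpreted" for the "=" forms, "verbatim" otherwise, and
    "image" for a plain ![alt](uri) whose URI has an image extension.
    """
    m = EMBED.fullmatch(markup)
    if m is None:
        return None
    interpreted, fmt, _alt, uri = m.groups()
    is_image = PurePosixPath(uri).suffix.lower() in IMAGE_SUFFIXES
    if not interpreted and not fmt and is_image:
        return ("image", "")
    return ("interpreted" if interpreted else "verbatim", fmt)


for example in ("!=[alt](phases.csv)", "![alt](notes.txt)",
                "!=csv[alt](data.txt)", "!python[alt](script.py)",
                "![alt](moon.jpg)"):
    print(example, classify(example))
```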

I am not so sure. Keeping on with your CSV example, it may be rendered as a table, or as a graph (I'm using this for pie charts for instance, in my own renderer, but other visualizations could be considered). This is very open and cannot be handled with a straight single = in the input syntax, so the best course of action might be to use a class (or key-value attr), e.g. ![Soccer games](soccer.csv){.piechart} and the renderer does its best to honor the class / key-value attr it knows, and to have a decent fallback otherwise...

Note that this is not "transclusion", per se, by the way.

EDIT: class vs. key-value vs. "semantic" tag, if the latter (#240) eventually makes it.
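A minimal sketch of the "honor the classes you know, fall back otherwise" idea for ![Soccer games](soccer.csv){.piechart}. The renderer registry, the function names and the text output are all hypothetical; a real renderer would produce a chart or a table node rather than a string.

```python
import csv
import io


def render_table(alt, rows):
    """Fallback: render the rows as a plain-text table under the alt text."""
    return "\n".join([alt] + [" | ".join(row) for row in rows])


def render_piechart(alt, rows):
    """Describe a pie chart as percentages.

    Assumes a header row followed by (label, value) rows.
    """
    body = rows[1:]
    total = sum(float(value) for _, value in body)
    shares = [f"{label} {float(value) / total * 100:.0f}%" for label, value in body]
    return f"{alt}: " + ", ".join(shares)


# Classes this hypothetical renderer knows about.
RENDERERS = {"piechart": render_piechart}


def render_csv(alt, csv_text, classes):
    """Use the first known class; otherwise fall back to a table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    for name in classes:
        if name in RENDERERS:
            return RENDERERS[name](alt, rows)
    return render_table(alt, rows)


soccer = "team,wins\nA,3\nB,1\n"
print(render_csv("Soccer games", soccer, ["piechart"]))  # Soccer games: A 75%, B 25%
print(render_csv("Soccer games", soccer, ["unknown"]))
```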

@Omikhleia you are right, the djot class system can be used for this; there are many ways to interpret a file.