mbakeranalecta/sam

HTML output mode

Closed this issue · 2 comments

It might be desirable to support an HTML output mode. In this mode, the parser would output an HTML document directly, translating the semantic structure of the SAM document into class information in the HTML so that CSS could be used to format it.

This would not capture all potential SAM document semantics, but it would give authors a schema-controlled, semantically authored way to constrain an HTML document to the expectations of a specific CSS stylesheet, which seems valuable.

The basics of this seem pretty simple. SAM already has the basic text structures of an HTML document: paragraphs, lists, simple tables. It also has block and phrase structures that correspond to HTML div and span elements. The name of a block can become the class of a div and the type of an annotation can become the class of a phrase.
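
To make the mapping concrete, here is a rough sketch of the idea in Python. It does not use the SAM parser's real data structures or API: the `Block`, `Paragraph`, and `Phrase` classes and the `render` function are hypothetical stand-ins invented for illustration. The only point is to show a block name becoming a div class and an annotation type becoming a span class.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List, Union


@dataclass
class Phrase:
    """Hypothetical annotated phrase: text plus an annotation type."""
    text: str
    annotation_type: str


@dataclass
class Paragraph:
    """Hypothetical paragraph: a mix of plain strings and phrases."""
    parts: List[Union[str, Phrase]]


@dataclass
class Block:
    """Hypothetical named block holding paragraphs or nested blocks."""
    name: str
    children: List[Union["Block", Paragraph]] = field(default_factory=list)


def render(node) -> str:
    """Render a node as HTML, carrying SAM names over as class attributes."""
    if isinstance(node, str):
        # Plain text: escape it so it cannot be read as markup.
        return escape(node)
    if isinstance(node, Phrase):
        # The annotation type becomes the class of a span.
        cls = escape(node.annotation_type, quote=True)
        return f'<span class="{cls}">{escape(node.text)}</span>'
    if isinstance(node, Paragraph):
        return "<p>" + "".join(render(part) for part in node.parts) + "</p>"
    if isinstance(node, Block):
        # The block name becomes the class of a div.
        cls = escape(node.name, quote=True)
        inner = "".join(render(child) for child in node.children)
        return f'<div class="{cls}">{inner}</div>'
    raise TypeError(f"unsupported node type: {type(node).__name__}")


# Usage: a "recipe" block with one paragraph containing an annotated phrase.
doc = Block("recipe", [Paragraph(["Add ", Phrase("salt", "ingredient"), "."])])
html_out = render(doc)
print(html_out)
```

A stylesheet could then target selectors such as `div.recipe` and `span.ingredient`, which is the sense in which the schema confines the document to what the CSS expects.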

We would have to go through the complete list of SAM structures in detail to find an HTML equivalent for each, and some might have to be dropped from the output, but in principle it should be straightforward.

HTML output mode implemented as of 3e9b8f6. Still needs to be documented.

Docs added in a2290ac.