Raku/RakuDoc-GAMMA

forbid or not forbid? embedded codes embedded in code

finanalyst opened this issue · 12 comments

@thoughtstream @lizmat
The current document rakudoc_draft_3.rakudoc is not compiling with

RAKUDO_RAKUAST=1 raku -c rakudoc_draft_3.rakudoc

Finding the problems is tedious. I found a few that have obvious solutions, which @lizmat implemented amazingly quickly.

I'm calling this a compilation error because I'm using the -c flag, but I suppose pedantically, it should be a parsing error.

Here's one without an obvious solution. Consider the following section of text:

Alias placement codes may also specify a default display text, before the
alias name and separated from it by a C<|>. When a display is specified,
it will be used if the requested alias cannot be found (and an I<"unknown alias">
warning will be issued in that case):

=begin code :lang<RakuDoc> :allow<B>
    The use of A<B<this program |> PROGNAME> is subject to the terms and conditions
    laid out by A<B<our company |> VENDOR>, as specified at:

        A<B<(Please visit our website) |> TERMS_URLS>
=end code

The superficial component, viz. A<display text|ALIAS>, parses without a problem as specified.

However, after discussion in another issue here and between Liz and me, we came to the conclusion that the renderer is responsible for metadata options, not the parser, for two reasons:

  1. The rendering of eg B<> will depend on the output format, which the parser cannot know about, but the renderer does.
  2. A config directive, such as =config code :allow<B>, would need to be tracked by the parser as well. Since the renderer has to track config directives, and the directives only involve metadata, it seems clear to leave metadata to the renderer.

This decision means that within a =begin/=end code block, the parser must parse all markup codes, leaving it to the renderer to follow the :allow option.
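
A minimal Python sketch of that division of labour, for illustration only. The Markup class, the hand-built tree, and the HTML-like output are all hypothetical and are not Rakudo's RakuAST API. The point is only that the parser hands over every code it finds, and the renderer decides which ones take effect:

```python
from dataclasses import dataclass, field


@dataclass
class Markup:
    """Hypothetical AST node for a markup code such as B<...>."""
    letter: str
    children: list = field(default_factory=list)


# What a parser that ignores :allow might produce for "B<bold> and I<italic>":
# every code is parsed, whether or not it is allowed in this block.
tree = [Markup("B", ["bold"]), " and ", Markup("I", ["italic"])]


def render(nodes, allow):
    """Render allowed codes as tags; put disallowed codes back as literal text."""
    out = []
    for node in nodes:
        if isinstance(node, str):
            out.append(node)
            continue
        inner = render(node.children, allow)
        if node.letter in allow:
            tag = node.letter.lower()
            out.append(f"<{tag}>{inner}</{tag}>")
        else:
            # Not in :allow, so the renderer restores the source form.
            out.append(f"{node.letter}<{inner}>")
    return "".join(out)
```

Here render(tree, {"B"}) corresponds to a code block with :allow<B>, and render(tree, set()) to a code block with no :allow at all.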

But A<> markup may not contain embedded markup within the ALIAS part.

These two rules in combination lead to the parser failing at

The use of A<B<this program |> PROGNAME> is

because the parser considers B<this program |> to be a part of the ALIAS.

I thought a workaround would be to use V<> markup, but the following does not work either, for a similar reason: the parser still has to parse inside V<> so that a =config V :allow< B > can work.

The use of V«A<»B<this program |> V«PROGNAME>» is subject to the terms and conditions

Unrelated to the above issue, the 'workaround' generates the far more useful error

===SORRY!===
RakuDoc markup code A missing endtag '>'.

than the error from the original example (which is why it has been difficult to find the reasons for the compilation errors):

===SORRY!===
This type cannot unbox to a native string: P6opaque, RakuAST::Doc::Markup

@thoughtstream @lizmat thoughts?

@finanalyst: if you get an execution error, pasting the stacktrace that is generated with --ll-exception is way more useful :-)

Fixed the error This type cannot unbox to a native string: P6opaque, RakuAST::Doc::Markup with rakudo/rakudo@a51ee10dbf

The fix generates AST; now I have to work out how to use it.

However ...
We also have the following:

=begin code :allow<B I V>
V« B<» I<DISPLAY-TEXT> V« >»
V« C<» I<DISPLAY-TEXT> V« >»
V« H<» I<DISPLAY-TEXT> V« >»
and all the other codes
=end code

which all fail at parsing with the error shown above (missing endtag '>').

I'd say the error is correct: "V« B<» I<DISPLAY-TEXT> V« >»" is balanced and hides the closing > of B<.

The fix is quite easy - remove the V<> markups. I don't think they add much.

But this does show that we have two conflicting rules:

  1. Everything inside a V / C must be provided as is
  2. Allow a =config V ... (e.g. =config V :allow< I >) to override the default behaviour.

If we provide the flexibility of a config for markup codes, then the parser has to parse the inner text.

But that's what is causing the error: the parser is parsing the inner text. So if you improperly balance markup markers, you get an error. Or am I missing something?
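
To make the balancing argument concrete, here is a small Python sketch of a checker that always descends into the inner text, as a parser supporting =config V would have to. It is a toy under stated assumptions, not Rakudo's grammar: the name check_balanced is hypothetical, only the < > and « » delimiter pairs are handled, and a code is taken to be any uppercase letter followed by an opening delimiter.

```python
CLOSERS = {"<": ">", "«": "»"}


def check_balanced(text, pos=0, closer=None, code=None):
    """Scan text, descending into every markup code, V<> included.

    Returns the position just after the scanned span, or raises
    ValueError when an opened code never finds its closing delimiter.
    """
    while pos < len(text):
        ch = text[pos]
        nxt = text[pos + 1:pos + 2]
        if ch.isupper() and nxt in CLOSERS:
            # A code opens here: scan its inner text recursively.
            pos = check_balanced(text, pos + 2, CLOSERS[nxt], ch)
        elif ch == closer:
            return pos + 1
        else:
            pos += 1
    if closer is not None:
        raise ValueError(f"markup code {code} is missing its endtag '{closer}'")
    return pos
```

With this sketch, check_balanced("V«B<» I<DISPLAY-TEXT> V«>»") raises for code B, because the B< opened inside the first V«» never sees a '>' of its own, while check_balanced("BV«<» I<DISPLAY-TEXT> V«>»") passes, because the bare B is not followed by a delimiter and the '<' inside V«» opens nothing.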

That is my point.
If - as in RakuDoc v1 - there is no config for markup, then there would be no need to parse the inside of C or V markup; just return the inner text as a string without parsing it.
BUT if - as in RakuDoc v2 - there is config potential for markup (which I think is a good thing), then we must require parsing the inner text. And that leads to an error.

I suppose a way around this is to say that if the parser fails to find a balancing markup marker, when inside another markup, then it returns the inner text as a string, and does not issue an error.
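
A rough Python sketch of that fallback, only to show the shape of the idea. The function names and the tuple-based tree are hypothetical, and only single-angle X<...> codes are handled. The inner text of an enclosing code is parsed when it balances and returned untouched as a string when it does not:

```python
def parse(text, pos=0, nested=False):
    """Parse X<...> codes into nested (letter, children) tuples and strings."""
    nodes, buf = [], ""
    while pos < len(text):
        ch = text[pos]
        if ch.isupper() and text[pos + 1:pos + 2] == "<":
            if buf:
                nodes.append(buf)
                buf = ""
            children, pos = parse(text, pos + 2, nested=True)
            nodes.append((ch, children))
        elif ch == ">" and nested:
            if buf:
                nodes.append(buf)
            return nodes, pos + 1
        else:
            buf += ch
            pos += 1
    if nested:
        # Ran out of text before finding the closing '>'.
        raise ValueError("missing endtag '>'")
    if buf:
        nodes.append(buf)
    return nodes, pos


def inner_content(inner):
    """Inner text of an enclosing code: parsed if balanced, else the raw string."""
    try:
        nodes, _ = parse(inner)
        return nodes
    except ValueError:
        # Fallback: no error is reported, the text is kept as written.
        return [inner]
```

The trade-off in this sketch is that a genuinely forgotten '>' inside another code would no longer be reported; it would silently become literal text.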

I seem to remember that the use of markup markers is ok in strings, but they need to be balanced. In the example given, they are not balanced, hence the error. A case of DIHWIDT!

Suppose that you do want to say something like B< in a string without intending it to be a markup code? That would seem to be impossible.

Just thought of this. The following works:

BV«<» I<DISPLAY-TEXT> V«>»

But not

V«B<» I<DISPLAY-TEXT> V«>»

So all good I think