newsdev/archieml.org

Spec may incorrectly limit multiline escaping


Hi there, I'm in the middle of implementing an ArchieML parser for C# from the ground up, using your tests and spec as my guide. I may have run into a contradiction regarding escaping of backslashes in multiline values. On one hand, the spec states:

To avoid as much processing as possible, leading backslashes should be removed only when the backslash is the first character of a line (but not a value's first line), and when the second character is any of the following: {, [, *, : or .

On the other hand, test multi_line.17.aml would fail if this were done. It asserts that a key-value line like \key:value should have its leading backslash stripped, but the spec wouldn't strip this since the second character is 'k'.
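To make the mismatch concrete, here is a minimal Python sketch of the rule exactly as the quoted spec text words it. The function name `strip_per_spec` and the constant `SPEC_ESCAPABLE` are hypothetical, made up for illustration; this is not code from any existing ArchieML parser, and it assumes the caller only passes lines after a value's first line.

```python
# Characters the quoted spec text lists as valid after a leading backslash.
SPEC_ESCAPABLE = set("{[*:.")

def strip_per_spec(line: str) -> str:
    """Strip a leading backslash only when it is the first character of the
    line and the second character is one of the spec's listed characters.
    Assumes `line` is a continuation line, not a value's first line."""
    if len(line) >= 2 and line[0] == "\\" and line[1] in SPEC_ESCAPABLE:
        return line[1:]
    return line

print(strip_per_spec("\\:end"))       # backslash removed, ':' is in the list
print(strip_per_spec("\\key:value"))  # unchanged, 'k' is not in the list
```

Under this literal reading the backslash in \key:value survives, which is the behavior the test contradicts.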

I'm going with the tests over the spec for now: I strip the initial backslash from every internal line of a multiline value when it is at column 0, regardless of what follows it. Feedback appreciated; I may be wrong about something.

Hi @partlyhuman, thanks for pointing this out!

You're right that the description is inconsistent; it's an artifact of a prior implementation of escaping. The current method is, I think, more straightforward, and seems to be what you've gone ahead with: the first leading backslash on a line within a multiline value should be stripped, regardless of what follows it.

As with other aspects of the archieml syntax, whitespace shouldn't be significant to detecting the backslash. This means that spaces should be allowed, and preserved, at the beginning of escaped lines, as long as the first non-whitespace character on the line is a backslash:

multilinevalue: Two examples:
\key: value
  \key: value
:end

Would result in:

Two examples:
key: value
  key: value
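For anyone implementing this, the rule described above could be sketched roughly as follows in Python. The function name `unescape_continuation_line` is hypothetical, and treating tabs as well as spaces as leading whitespace is an assumption on top of what is stated here; the sketch also assumes it is only called on lines after a value's first line.

```python
def unescape_continuation_line(line: str) -> str:
    """Remove the first backslash on a line when it is the first
    non-whitespace character, keeping the leading whitespace intact.
    Assumption: leading whitespace means spaces and tabs."""
    stripped = line.lstrip(" \t")
    indent = line[: len(line) - len(stripped)]
    if stripped.startswith("\\"):
        # Drop exactly one backslash; anything after it is left as-is.
        return indent + stripped[1:]
    return line

# The two escaped lines from the example, as they appear inside the value.
for raw in ["\\key: value", "  \\key: value"]:
    print(repr(unescape_continuation_line(raw)))
```

Only one backslash is removed per line, so a line that should begin with a literal backslash would need to be written with two.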

Does that seem reasonable? I'll update the spec to be consistent with the existing parsers.