delph-in/pydelphin

Quote predicates containing reserved characters in SimpleMRS


From #371 (comment)

In SimpleMRS, a string predicate containing reserved characters (whitespace, <, [, etc.) loses its quotes after being read in: it is serialized unquoted, which produces output that cannot be decoded again.

>>> m = simplemrs.decode('[RELS: < [ "foo bar" LBL: h0 ARG0: e2 ] >]')
>>> print(simplemrs.encode(m))
[ RELS: < [ foo bar LBL: h0 ARG0: e2 ] > ]
>>> simplemrs.decode(simplemrs.encode(m))
[...]
delphin.mrs._exceptions.MRSSyntaxError: 
    [ RELS: < [ foo bar LBL: h0 ARG0: e2 ] > ]
                    ^
MRSSyntaxError: expected: a feature

On serialization, each predicate should be checked for reserved characters. If any are present, the predicate should be written in quotes so that the output round-trips through decode.
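
A minimal sketch of that check, not pydelphin's actual code. The helper name `quote_pred_if_needed` and the exact reserved set are assumptions: the issue only names whitespace, `<`, and `[` before "etc.", so the closing counterparts `>` and `]` are added here as a guess. Escaping of quote characters inside the predicate is left out.

```python
import re

# Assumed reserved set: whitespace plus the SimpleMRS list and
# structure delimiters. The real serializer may need more characters.
_RESERVED = re.compile(r'[\s<>\[\]]')


def quote_pred_if_needed(pred: str) -> str:
    """Return pred wrapped in double quotes if it contains a reserved character.

    Hypothetical helper for illustration; predicates without reserved
    characters are returned unchanged.
    """
    if _RESERVED.search(pred):
        return '"{}"'.format(pred)
    return pred


print(quote_pred_if_needed('foo bar'))   # "foo bar"
print(quote_pred_if_needed('_dog_n_1'))  # _dog_n_1
```

Calling something like this at the point where the encoder writes each predicate would keep `foo bar` quoted in the output, so the `decode(encode(m))` call above would no longer hit the syntax error.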