datalust/superpower

Annotate tokens with their positions from the source text

arseniiv opened this issue · 9 comments

In Sprache, you provided IPositionAware and Positioned to make a parse result, well, aware of the position at which it was parsed. I find this feature useful for reporting the precise position of a syntax construct in post-parse checks (for example, “this variable right here wasn’t declared”, as opposed to the same message without a position, which leaves the user to search for the place themselves).

There is Result<T>.Location, but I don’t see how I could apply it to the resulting value via combinators. Could I achieve this here, and how would you advise doing it? (Or maybe I’m looking for the wrong thing, and this should be done another way.)

Hi!

There's no built-in combinator; I think it would be reasonably easy to write one, using similar tactics to Sprache's implementation - keen to explore how it might look.

If you want to drop this into your own project I think it's roughly:

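using Superpower;
using Superpower.Model;

public interface ILocated
{
    TextSpan Location { get; set; }
}

// Extension methods must live in a static class; the class name here is arbitrary.
public static class LocationExtensions
{
    public static TextParser<T> WithLocation<T>(this TextParser<T> parser)
        where T : ILocated
    {
        return i => {
            var inner = parser(i);
            if (!inner.HasValue) return inner;
            // Copy the location of the match onto the parsed value.
            inner.Value.Location = inner.Location;
            return inner;
        };
    }
}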

(Sketched in browser, no idea whether or not this will compile as-is ;-))

HTH,
Nick

Ah, thank you! I’ll look at it and write back if I run into problems that are hard to fix. (Or if it goes smoothly, anyway.)

Hi again, I’ve tested this code, and it works like a charm!

…Almost. The Length of every TextSpan returned always seems to be the same (namely, the full length of the string parsed). Is that expected? I used Superpower 2.3.0 from NuGet, and here is my source and some examples.

I’m okay with having only start positions, though. Thanks once more!

Thanks for the follow-up! That's great.

I think the proper span length could be reported using something like:

    return i => {
        var inner = parser(i);
        if (!inner.HasValue) return inner;
        inner.Value.Location = inner.Location.Until(inner.Remainder);
        return inner;
    };

Let's leave this open as a nod towards implementing it within Superpower sometime in the future :-)

This modification works nicely. 🙂

I have been trying to get this to work on a TokenListParser instead of a TextParser, but I can't figure out how to retrieve the token's source position from the parsed TokenListParserResult. I want behaviour similar to the Positioned() method from Sprache, but my grammar is fully tokenized at this stage. Below is my adaptation of the WithLocation method for TokenListParser. Is this possible with the current interface of Superpower? Am I missing something?

public static TokenListParser<TKind, T> WithLocation<TKind, T>(this TokenListParser<TKind, T> parser)
    where T : ILocated
{
    return i => {
        var inner = parser(i);
        if (!inner.HasValue) return inner;
        inner.Value.Location =   // Can't figure out how to retrieve position information from inner
        return inner;
    };
}

Hi @JoeMGomes - unfortunately no time to dig in properly but hopefully this helps:

The start index of the match within the input will be inner.Location.First().Position.Absolute.

The exclusive end index will be inner.Remainder.First().Position.Absolute.

In the second case it's also possible that the remainder token list will be empty, which would mean "matched until end of stream". Calling First() on an empty list throws, so check for that case before reading the end index.
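
For reference, the two indices above could be combined into a TokenListParser version of WithLocation roughly as follows. This is only a sketch, not something that ships with Superpower: the class name TokenLocationExtensions, the extra "class" constraint on T, and the choice to end the span at the last token when the remainder is empty are all assumptions made for illustration.

```csharp
using System.Linq;
using Superpower;
using Superpower.Model;

public interface ILocated
{
    TextSpan Location { get; set; }
}

// Hypothetical helper class; the name is arbitrary.
public static class TokenLocationExtensions
{
    public static TokenListParser<TKind, T> WithLocation<TKind, T>(
        this TokenListParser<TKind, T> parser)
        where T : class, ILocated // "class" is an added assumption so the setter mutates the parsed object
    {
        return input =>
        {
            var inner = parser(input);
            if (!inner.HasValue) return inner;

            // No tokens at the start of the match, so there is no position to record.
            if (inner.Location.IsAtEnd) return inner;

            // Span of the first token at the start of the match.
            var first = inner.Location.First().Span;

            int end;
            if (inner.Remainder.IsAtEnd)
            {
                // Matched until end of stream: stop at the end of the last token.
                var last = inner.Location.Last().Span;
                end = last.Position.Absolute + last.Length;
            }
            else
            {
                // Exclusive end index: where the next unconsumed token starts.
                end = inner.Remainder.First().Position.Absolute;
            }

            inner.Value.Location = new TextSpan(
                first.Source, first.Position, end - first.Position.Absolute);
            return inner;
        };
    }
}
```

Because the exclusive end is taken from the start of the next token, the recorded span also covers any whitespace or other skipped text between the last matched token and the one that follows it. If that matters, the end of the last consumed token would be the tighter choice.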

This worked nicely for me! Thank you very much! Any reason for this not to be part of Superpower?

That's good to know @JoeMGomes 👍

Just design and implementation time constraints, currently, but it seems like a worthwhile inclusion for the future 👍