pointfreeco/swift-parsing

Whitespace ergonomic improvements would be nice

haikusw opened this issue · 1 comments

Skipping whitespace is one of the most common things you need to do when parsing.
In nearly all cases you just want to discard it.

The current Whitespace parser only works on UTF-8 views (e.g. Substring.UTF8View) and not on Substring, so one has to wrap it in FromUTF8View to use it in a Substring context.

Since one nearly always wants to skip whitespace, it's tempting to try to write something like this:

let skippedWhitespace = Skip { Whitespace() }

but that requires using the FromUTF8View in the oft-used Substring context:

var testMany = Parse() {
    Many {
        test
    } separator: {
        FromUTF8View {            // do not want this here, but the compiler requires it
            skippedWhitespace     // want this to be a single line that skips whitespace without a compiler error
        }
    }
}

One can define the skipping parser as follows so that the conversion from the UTF-8 view is built in and it can be used directly on Substring:

let skippedWhitespaceFromUTF8View = Skip<FromUTF8View<Substring, Whitespace<Substring.UTF8View>>> { FromUTF8View { Whitespace() } }

but that's not exactly intuitive to come up with.

Improving this would be a welcome addition to the library.

One commenter mentioned that a whitespace parser for Substrings might need to include more characters in its definition of whitespace. The current Whitespace parser is actually only an ASCII whitespace parser and so may want to be renamed.
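
To make the ASCII-versus-Unicode point concrete, here is a small standard-library-only sketch (it does not use swift-parsing). The helper names and the four-byte set are my own assumptions about what an ASCII-only parser would match, not the library's actual code:

```swift
// Bytes treated as ASCII whitespace in this sketch (an assumption, not library code).
let asciiWhitespace: Set<UInt8> = [0x20, 0x09, 0x0A, 0x0D]  // space, tab, LF, CR

/// Counts the leading bytes that belong to `asciiWhitespace`.
func asciiWhitespacePrefixCount(_ input: Substring.UTF8View) -> Int {
    input.prefix(while: { asciiWhitespace.contains($0) }).count
}

/// Counts the leading characters that Unicode classifies as whitespace.
func unicodeWhitespacePrefixCount(_ input: Substring) -> Int {
    input.prefix(while: { $0.isWhitespace }).count
}

// A tab, an ideographic space (U+3000), and a regular space, followed by text.
let sample: Substring = "\t\u{3000} hello"

// The byte-level check stops at the first byte of U+3000.
let asciiCount = asciiWhitespacePrefixCount(sample.utf8)
// The character-level check consumes all three whitespace characters.
let unicodeCount = unicodeWhitespacePrefixCount(sample)
```

An ASCII-only definition would leave the ideographic space in the input, which is probably not what someone working at the Substring level expects.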

A more general whitespace parser would thus likely be based on Prefix and allow passing an optional character set or Substring with the whitespace characters to include in the definition of "whitespace".
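
As a rough illustration of that idea, here is a standard-library-only sketch of a prefix-based skipper with an optional character set. The function name `skipWhitespace` and its `in:` parameter are hypothetical and only mimic what a library parser might offer:

```swift
/// Removes leading whitespace from `input` and returns how many characters were dropped.
/// When `whitespace` is nil, Unicode whitespace (`Character.isWhitespace`) is used;
/// otherwise only the characters in the given set count as whitespace.
@discardableResult
func skipWhitespace(
    _ input: inout Substring,
    in whitespace: Set<Character>? = nil
) -> Int {
    let isWhitespace: (Character) -> Bool = { character in
        if let whitespace = whitespace { return whitespace.contains(character) }
        return character.isWhitespace
    }
    let leading = input.prefix(while: isWhitespace)
    input.removeFirst(leading.count)
    return leading.count
}

// Default definition: space, tab, and newline are all skipped.
var defaultInput: Substring = " \t\n42"
let droppedByDefault = skipWhitespace(&defaultInput)

// Custom definition: only space and tab are skipped, so the newline stays.
var customInput: Substring = " \t\n42"
let droppedCustom = skipWhitespace(&customInput, in: [" ", "\t"])
```

Defaulting to the Unicode definition while allowing an override keeps the common case to a single call and still lets grammars where newlines are significant exclude them.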

Related: Newlines are also a common parsing token to be dealt with and they should be easy to deal with at all levels of collection inspection (Substring, UTF8View).
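
To show why the level matters for newlines, here is a standard-library-only sketch with one hypothetical `skipNewlines` overload per level. At the UTF8View level "\r\n" is two bytes, while at the Substring level it is a single Character:

```swift
/// Removes leading LF (0x0A) and CR (0x0D) bytes from a UTF-8 view.
func skipNewlines(_ input: inout Substring.UTF8View) {
    input = input.drop(while: { $0 == 0x0A || $0 == 0x0D })
}

/// Removes leading newline characters from a substring.
/// "\r\n" is one `Character` here, and `isNewline` also matches
/// Unicode line separators such as U+2028.
func skipNewlines(_ input: inout Substring) {
    input = input.drop(while: { $0.isNewline })
}

// Byte level: CR, LF, LF are dropped one byte at a time.
var bytes = Substring("\r\n\nrest").utf8
skipNewlines(&bytes)
let restFromBytes = String(decoding: bytes, as: UTF8.self)

// Character level: "\r\n" and U+2028 are each a single newline character.
var text: Substring = "\r\n\u{2028}rest"
skipNewlines(&text)
```

Note that the byte-level version above would not skip U+2028, so the two levels only agree on ASCII newlines.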

Personally, I've just been using Prefix(while: \.isWhitespace).