typelevel/cats-parse

Inconvenient Parser0 type derivation


Hi everyone. I am trying to port a Ruby example to cats-parse and am having trouble with Parser0. Something seems to be wrong somewhere.

Here is the example:

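// Imports assumed from context; pchar is an alias for Parser.char
import cats.parse.Parser.{char => pchar}
import cats.parse.Rfc5234.{alpha, sp}

case class Term(value: String)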
case class Operator(value: String)
case class Clause(op: Option[Operator], term: Term)

case class Query(value: List[Clause])

val term = alpha.rep.string.map(Term.apply)

val space = sp.rep

val operator = (pchar('+') | pchar('-')).string.map(Operator.apply)

// Parser0 ???
val clause = (operator.? ~ term).map { case (op, t) => Clause(op, t) }

// no rep in Parser0
//val query = (clause <* space.?).rep.map(t => Query.apply(t.toList))

So the main problem is in the clause parser. Why does it become a Parser0? The term is not term.?; it must always be present in the query. Being a Parser0 makes it impossible to use the rep method.

I think the concept missing here is .with1 which allows you to compose a Parser0 and a Parser to make a Parser.

So, change clause to:

val clause: Parser[Clause] = (operator.?.with1 ~ term).map { case (op, t) => Clause(op, t) }

The idea is: if you first may parse nothing and then definitely parse something, the whole thing will definitely parse something. The reverse order (Parser then Parser0) doesn't require any with1.
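To make the two orderings concrete, here is a minimal sketch. The names `word`, `sign`, `signedWord`, and `wordThenSpace` are made up for illustration, and the imports are assumed rather than taken from the thread.

```scala
import cats.parse.Parser
import cats.parse.Rfc5234.{alpha, sp}

object With1Sketch {
  // Both of these always consume at least one character.
  val word: Parser[String] = alpha.rep.string
  val sign: Parser[Unit] = Parser.char('+')

  // Parser0 then Parser: sign.? is a Parser0, so a plain ~ would give a
  // Parser0. Going through .with1 keeps the result a Parser.
  val signedWord: Parser[(Option[Unit], String)] = sign.?.with1 ~ word

  // Parser then Parser0: ~ on a Parser already returns a Parser.
  val wordThenSpace: Parser[(String, Option[Unit])] = word ~ sp.?
}
```

Both results are typed as Parser, so rep is available on either of them.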

It would be possible to make the code more generic using a typeclass to compute the return type based on any pair of {Parser0, Parser} but that would make the code and documentation harder to read.

Now you should be set.
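Putting it together, the original example might look like the sketch below once with1 is applied. The imports and the `QuerySketch` wrapper object are assumptions; the parsers themselves follow the code from the question.

```scala
import cats.parse.Parser
import cats.parse.Parser.{char => pchar}
import cats.parse.Rfc5234.{alpha, sp}

object QuerySketch {
  case class Term(value: String)
  case class Operator(value: String)
  case class Clause(op: Option[Operator], term: Term)
  case class Query(value: List[Clause])

  val term: Parser[Term] = alpha.rep.string.map(Term.apply)
  val space: Parser[Unit] = sp.rep.void
  val operator: Parser[Operator] =
    (pchar('+') | pchar('-')).string.map(Operator.apply)

  // with1 turns "optional operator, then mandatory term" into a Parser
  val clause: Parser[Clause] =
    (operator.?.with1 ~ term).map { case (op, t) => Clause(op, t) }

  // clause is a Parser, so clause <* space.? is one too, and rep compiles
  val query: Parser[Query] =
    (clause <* space.?).rep.map(nel => Query(nel.toList))
}
```

With these definitions, `QuerySketch.query.parseAll("+foo -bar baz")` should produce three clauses, the last one without an operator.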

PS: I would welcome any PRs to summarize some of the things you've learned in an FAQ section of the README

But what if I want to do something like

(Parser0[Unit] *> Parser0[String]).rep ?

This happens to me often because I have a lot of "OR" clauses where either a Parser0 or a Parser should be chosen. It looks like it could be simplified somehow to Parser0[NonEmptyList[String]]...

That parser, a repetition of one that can parse nothing, can return an infinitely long list of empty strings: each iteration succeeds without consuming any input, so the repetition never ends.

If we provided this method, it would compile but blow up at runtime with OOM.

If you want this I think it means you need to find a way to restructure the parser.
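One possible restructuring, sketched under the assumption that the String part can be required to be non-empty: keep the leading part optional, but make the payload a Parser, so each repetition must consume input. The names `skip`, `word`, and `words` are hypothetical.

```scala
import cats.data.NonEmptyList
import cats.parse.{Parser, Parser0}
import cats.parse.Rfc5234.{alpha, sp}

object RepSketch {
  // May consume nothing: zero or more spaces
  val skip: Parser0[Unit] = sp.rep0.void

  // Must consume at least one character
  val word: Parser[String] = alpha.rep.string

  // Parser0 *> Parser via with1 is a Parser, so rep is allowed and
  // every iteration makes progress through the input.
  val words: Parser[NonEmptyList[String]] = (skip.with1 *> word).rep

  // If "no words at all" should also be accepted, rep0 gives a Parser0
  val maybeWords: Parser0[List[String]] = (skip.with1 *> word).rep0
}
```

Whether this fits depends on the grammar; if both alternatives can genuinely match the empty string, the emptiness has to be handled outside the repetition, for example with `.?` around the whole repeated parser.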

OK, thank you. I'll try to contribute to the FAQ when I get some free time.