purescript-contrib/purescript-parsing

Implement `Compactable` and `Filterable` for `ParserT`?

Closed this issue · 9 comments

Yesterday i noticed that a Filterable instance could be useful for ParserT

Let's say you want to sometimes discard a parse based on some condition or predicate. For example, let's say you have a parseCSV :: Parser String (List (List String)), which parses a CSV input to a list of lines (lines being lists of strings, one string per value between commas or newlines). The first row of the CSV is usually just all the column names, so let's say i want to remove that line using tail :: forall a. List a -> Maybe (List a). I could then use filterMap :: forall a b f. Filterable f => (a -> Maybe b) -> f a -> f b to remove that like this: filterMap tail parseCSV. The result of this would be a parser that ignores the first line of the CSV, and only gives the rest. It would also fail if the CSV input only has one line. Additionally, a custom error message can be provided like this (i think): filterMap tail parseCSV <?> "CSV does not contain any data, only column names!"

The implementation would be similar to the Filterable instance for Maybe and Either, in the sense that it's only got one item to filter (as opposed to a list).

I could write up a PR for this, just wanted to hear some opinions first.

Tried for a bit to implement Compactable, but i'm not sure of the best way to do this as i'm not super familiar with ParserT, and i'm sure my implementation would be lacking or slightly incorrect regarding which error messages and positions to return in case of a parse that was "compacted" or filtered away.

Insight appreciated

Looked a bit more into it and i've now written a Compactable instance. if compact is called on a parser that returns Nothing, it will act as if that parser failed and reset the position and consumed flag to what they were before the parser was run, and throw the error "Parser returned Nothing". Similar behavior for separate. I also added a function: setConsumed :: forall s m. Monad m => Boolean -> ParserT s m Unit, because i needed it for the implementation and thought it could be useful elsewhere too. (Should i add an unconsume as well to complement consume?🙂)

If i were to implement a Filterable instance, and add some tests, is that a PR you would consider merging?

Hi @3ddyy! Thanks for opening the issue and providing a good description of how your Compactable and Filterable instances would work.

I'm a maintainer of this library, but I'm unfamiliar with Compactable and Filterable so it's difficult for me to judge this issue. I'll need to defer to other maintainers about the addition.

That said, as we're a bit thinly-spread across a number of libraries it's usually best to open the PR (if it isn't too much additional effort) so that the other maintainers have something more concrete to look at and there is less action to take to get it merged to the library once it's confirmed as a reasonable addition.

From your description so far there doesn't seem to be any harm in adding these instances especially if you're able to keep reasonable positions and consumed flags, so I would encourage you to open a pull request.

However, if you would prefer for another maintainer to weigh in who can give you more concrete guidance then feel free to do so. Thanks!

I've already written the Filterable instance, so making a PR shouldn't be too difficult. I'll familiarize myself with the test library and write some tests either tomorrow or some other time soon, and submit the PR then 👍

garyb commented

What library provides the Filterable and Compactable classes? We try to keep dependencies in -contrib from reaching any further out than -contrib as there are times where updating things has been blocked on dependencies we have no control over.

garyb commented

Ah ok, I had an idea it was Liam's, but Pursuit wasn't finding it. I'm fine with that since he's a core maintainer.

Let's say you want to sometimes discard a parse based on some condition or predicate. For example, let's say you have a parseCSV :: Parser String (List (List String)), which parses a CSV input to a list of lines (lines being lists of strings, one string per value between commas or newlines). The first row of the CSV is usually just all the column names, so let's say i want to remove that line using tail :: forall a. List a -> Maybe (List a). I could then use filterMap :: forall a b f. Filterable f => (a -> Maybe b) -> f a -> f b to remove that like this: filterMap tail parseCSV. The result of this would be a parser that ignores the first line of the CSV, and only gives the rest. It would also fail if the CSV input only has one line. Additionally, a custom error message can be provided like this (i think): filterMap tail parseCSV <?> "CSV does not contain any data, only column names!"

Here’s the solution for this in terms of liftMaybe from #212 .

runParser csvinput do
  datarows :: List (List String) <- tryRethrow do
    allrows <- parseCSV
    liftMaybe (\_ -> "CSV does not contain any data, only column names!") $ tail allrows

If the “CSV contains some data” condition is not met, then the error position is at the beginning of the CSV, because of the tryRethrow.

This solution to the example seems pretty simple and general to me, so I think I'm going to say that #212 solves this issue without adding any new typeclasses to ParserT. I’d be happy to discuss further of course.