mrkkrp/forma

Parser composition

mulderr opened this issue · 6 comments

First off, I really like forma so far, but today I stumbled upon a use case I'm not sure how to approach.

Example structure we want to parse (suppose a and b are much more complicated):

{
  "a": { "afield": "foo" },
  "b": { "bfield": "bar" }
}

We would like to be able to write a parser for a and b separately and then compose using subParser:

data Prod = Prod A B
data A = A Text
data B = B Text

type ProdFields = '["a", "b"]
type AFields = '["afield"]
type BFields = '["bfield"]

aForm :: Monad m => FormParser AFields Text m A
aForm = A <$> field' #afield

bForm :: Monad m => FormParser BFields Text m B
bForm = B <$> field' #bfield

-- uh oh, parsers are parametrized by different [Symbol] :(
prodForm :: Monad m => FormParser ProdFields Text m Prod
prodForm = Prod
  <$> subParser #a aForm
  <*> subParser #b bForm

We could just define a single type AllFields = '["a", "b", "afield", "bfield"] but that would be less precise, less safe and would still not compose nicely with other code.

Am I correct that right now the only way is to run the subParsers separately and combine the results after?

Tried the following workaround but it's rather unsatisfactory:

-- (++) at type level, should probably do a union instead
type family ((a :: [k]) :++: (b :: [k])) :: [k]
type instance '[] :++: xs = xs
type instance (a ': as) :++: bs = a ': (as :++: bs)

-- TODO: ensure n1 is a subset of n2
coerceParser :: FormParser n1 e m a -> FormParser n2 e m a
coerceParser = unsafeCoerce

prodForm :: Monad m => FormParser (ProdFields :++: AFields :++: BFields) Text m Prod
prodForm = Prod
  <$> subParser #a (coerceParser aForm)
  <*> subParser #b (coerceParser bForm)
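The TODO above (ensure n1 is a subset of n2) could be expressed at the type level. Below is a rough sketch, not forma's API: `P` is a hypothetical stand-in for FormParser with a phantom list of names, and `Elem`, `Subset` and `widen` are made-up names for illustration. With the real FormParser the body would still need unsafeCoerce or access to the constructor; only the constraint on the signature would carry over.

```haskell
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}

import Data.Kind (Constraint)
import GHC.TypeLits (ErrorMessage (..), Symbol, TypeError)

-- Elem x xs is satisfied when label x occurs in xs; otherwise the
-- constraint reduces to a custom compile-time error.
type family Elem (x :: Symbol) (xs :: [Symbol]) :: Constraint where
  Elem x '[]       = TypeError ('Text "unknown field: " ':<>: 'ShowType x)
  Elem x (x ': xs) = ()
  Elem x (y ': xs) = Elem x xs

-- Subset xs ys is satisfied when every label in xs occurs in ys.
type family Subset (xs :: [Symbol]) (ys :: [Symbol]) :: Constraint where
  Subset '[]       ys = ()
  Subset (x ': xs) ys = (Elem x ys, Subset xs ys)

-- Hypothetical stand-in for FormParser: names is a phantom parameter,
-- and the "form" is just an association list of strings.
newtype P (names :: [Symbol]) a = P { runP :: [(String, String)] -> Maybe a }

-- Widening is only allowed when the smaller set of names is contained
-- in the larger one, so no unsafeCoerce is needed for this toy type.
widen :: Subset n1 n2 => P n1 a -> P n2 a
widen (P f) = P f

aForm :: P '["afield"] String
aForm = P (lookup "afield")

-- Accepted: "afield" is one of the names of the wider parser.
aWide :: P '["a", "b", "afield", "bfield"] String
aWide = widen aForm
```

Trying to widen aForm to a list that lacks "afield" should be rejected at compile time with the "unknown field" message, which is the check the unsafeCoerce version skips.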

At this point I'm wondering whether it would be possible to provide alternative versions of things that drop the [Symbol] guarantees and maybe just allow you to use Text. Personally, I tend to use the labels only once anyway, so whether I make a mistake in the type definition or in the parser definition seems irrelevant :)

Exposing the inner bits of forma via an Internal module would also work without any code changes. You could then have an orphan instance IsString (FieldName names) or use FormParser '[] if you really wanted.

I think the problem is that subParser's type is not quite right: it should allow you to extend the set of the sub-parser's names. This could be captured nicely with dependent types.

Practically speaking, I agree that this provides little benefit, and I would be in favor of dropping the [Symbol] altogether and just using Text for field names as suggested.

Don't know about dropping [Symbol] altogether. Seems radical.

I created a fork that exposes an Internal module. This allows me to solve my immediate problem in application code. @mrkkrp let me know if that's something you would consider merging.

This is a valid concern, and it would be cool to make it work better. The way it works right now, I'm afraid there is only one level, where all labels must be combined together. I do not see an easy way to get it to work otherwise.

Why is putting all fields together such a big issue for you, @mulderr? Evidently, it is not nice, but perhaps it is still better than relaxing the guarantees of the library?

Two reasons:

  1. Arguably overloaded labels come at some cost and this way some of the guarantees are lost right from the start. Yes, I won't make a typo, but I may use a label I shouldn't be able to use in the first place. I could use #bfield inside a parser for A and it would compile just fine.

  2. More importantly, separation of concerns. I could have separate modules for A, B and Prod. It would be nice to be able to define A together with labels (easier to keep them in sync) and a parser for A without having to think that it may later be used inside Prod. With a single alias I need to decide up front what will compose with what and share the alias definition across modules.

I will admit these aren't very serious, but in practice they do cause me some grief when I want to make things reusable.

Now, I could have a module somewhere that defines one giant alias for everything but that doesn't sound like a very fun thing to maintain :) Plus point 1. above.

These are certainly not enough to warrant huge changes to the library. Nor do I like the idea of going with only Text for everything, but I would love an option to do so if I accept the risks. Right now I'd just ask for the liberty to shoot myself in the foot by using Internal.

I've come to the conclusion that while forma has its "sweet spot", it is just not a good fit for what I'm trying to do. I should probably use aeson directly.
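For reference, a rough sketch of what the aeson route could look like for the example structure at the top of this issue. It assumes plain FromJSON instances are acceptable and leaves out the per-field error reporting that forma provides; the `deriving` clauses are added only so the results can be compared and printed.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (FromJSON (..), eitherDecode, withObject, (.:))
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Text (Text)

data A = A Text deriving (Eq, Show)
data B = B Text deriving (Eq, Show)
data Prod = Prod A B deriving (Eq, Show)

-- Each parser only knows about its own object, so A and B can live in
-- separate modules and be reused inside any enclosing structure.
instance FromJSON A where
  parseJSON = withObject "A" $ \o -> A <$> o .: "afield"

instance FromJSON B where
  parseJSON = withObject "B" $ \o -> B <$> o .: "bfield"

-- Composition happens through the FromJSON instances of the fields.
instance FromJSON Prod where
  parseJSON = withObject "Prod" $ \o -> Prod <$> o .: "a" <*> o .: "b"

-- Decode the example document from the top of this issue.
decodeProd :: BL.ByteString -> Either String Prod
decodeProd = eitherDecode
```

The trade-off is the one discussed above: the field names are ordinary string literals here, so nothing checks them against a type-level list.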

@mrkkrp thanks for clearing things up, this was very helpful nevertheless!