Applicative parsing

Question

Applicative parsing

m4rw3r opened this issue 9 years ago · 0 comments

Problem

Some somewhat common constructions using monadic composition are more complex than they have to be. Applicative solves that by being less powerful and lacking context sensitivity (commonly):

do
    a <- digitP
    b <- digitP
    c <- digitP
    d <- digit
    return $ IP a b c d
  where digitP = digit >>= \x -> skip '.' >>= \_ -> return x

-- vs

pure IP <$> digitP <*> digitP <*> digitP <*> digitP <*> digit
  where digitP = digit <* (skip '.')

The default implementation of Applicative for a Monad also works for Chomp:

instance Applicative X where
    pure    = return
    d <*> e = do
      b <- d
      a <- e
      return (b a)
    (*>)    = (>>)
    x <* y  = x >>= \a -> y >> return a

But the issue here is that to be able to write things like

i.ret(IP).apply(digitP).apply(digitP).apply(digitP).apply(digit)

IP must be a function which is partially applied and is invoked like IP(192)(168)(0)(23) which is both very strange to use in Rust (which does not support partial application natively) and very slow since it forces boxed closures to be used to store the intermediate state.

Implementation

pure

Same implementation as Input::ret, use Input::ret instead of pure.
<*>

Problematic due to evaluation order and no partial application, see [Implementation] section.
*>

Does not need any implementation, just use then.
<*

Name skip, simple and no special considerations: self.bind(|i, t| rhs(i).map(|_| t))

Input::ret is kept as is to provide a way to lift values into the Applicative. then is also left as is as a method on ParseResult. skip is added to impl ParseResult since it is very useful even in monadic chaining.

apply is either implemented on a trait in a separate module or directly on the ParseResult.

Variants

There are multiple ways of implementing apply

Following Haskell's default Applicative

i.ret(capitalize).apply(any)

impl ParseResult<T, E> {
    pub fn apply<F, U, V, W>(self, F) -> ParseResult<W, V>
      where T: FnOnce(U) -> W,
            F: FnOnce(Input) -> ParseResult<U, V>,
            V: From<E>;
}

Pros

Simple
Follows a direct translation of Applicative laws
Does not consider values and functions differently

Cons

Only allows for 1-arity functions
Requires partial application to compose well
Requires utility functions or closures adapting the results if functions of n-arity are to be used
Unstable features are used if boxing of functions are needed

Monad laws

// Identity:
lhs_m.ret(|x| x).apply(f) === f(rhs_m)
// Composition:
lhs_m.ret(compose).apply(u).apply(v).apply(w) === u(rhs_m).apply(|i| v(i).apply(w))
// Homomorphism:
lhs_m.ret(f).apply(|i| i.ret(x)) === rhs_m.ret(f(x))
// Interchange:
u(lhs_m).apply(|i| i.ret(y)) === rhs_m.ret(apply_arg(y)).apply(u)

Tuple as an argument

fn digitP(i: Input<u8>) -> U8Result<u8> { digit(i).skip(|i| token(i, b'.')) }
i.ret(IP).apply((digitP, digitP, digitP, digit))

trait ApplyArgs<F> {
    type Output;
    type Error;
    fn apply(self, Input, F) -> ParseResult<Self::Output, Self::Error>;
}

impl<A, AT, AE, T, F> ApplyArgs<F> for A
  where F: FnOnce(AT) -> T,
          A: FnOnce(Input) -> ParseResult<AT, AE> {
    type Output = T;
    type Error = AE;

    fn apply(self, i: Input, f: F) -> ParseResult<Self::Output, Self::Error> {
        self(i).map(f)
    }
}

// Impls for tuples, eg. (A, B) where F: FnOnce(A, B) -> Out, A: FnOnce(Input) -> ParseResult<A, AE>, ...

impl ParseResult<T, E> {
    pub fn apply<A>(self, rhs: A) -> ParseResult<A::Output, A::Error>
      where A: ApplyArgs<T>,
             A::Error: From<E> {
        self.bind(|i, f| rhs.apply(i, f))
    }
}

Pros

Allows for arbitrary number of parameters
Compact syntax in usage
Does not treat functions and values differently
Applicative laws still apply if translated to tuples for multiple arguments

Cons

Fails to properly infer for closures in tuples like (|i| i.ret("foobar"), ...) (It cannot determine that i is Input) since it is using several traits, struct ParseResult -> trait ApplyArgs -> trait FnOnce -> concrete closure
Not very ergonomic at times since types have to be declared for all parameters.
Not possible to stack any values conditionally, the specific expression has to be known at compile time
Not as composable as the default since the tuples do not lend themselves well to partial application (which is kind of necessary to be able to express Applicative laws in a simple manner).
Horrible type impls for the tuples

Applicative laws

// Identity:
lhs_m.ret(|x| x).apply(f) === f(rhs_m)
// Composition:
lhs_m.ret(compose).apply((u, v)).apply(w) === u(rhs_m).apply(|i| v(i).apply(w))
// Homomorphism:
lhs_m.ret(f).apply(|i| i.ret(x)) === rhs_m.ret(f(x))
// Interchange:
u(lhs_m).apply(|i| i.ret(y)) === rhs_m.ret(apply_arg(y)).apply(u)

Stacking tuples

The idea here is to stack values into a tuple and then invoke a function with the given values:

i.ap(digitP).ap(digitP).ap(digitP).ap(digit).apply(IP)

Pros

Allows for somewhat conditional stacking of values
Follows Applicative laws if rewritten in Reverse Polish Notation with the stack-operator ´ap`

Cons

Treats values and functions differently
Treats tuples and non-tuples differently (cannot make a blanket impl for any T since it will conflict with all the impls for tuples).
Does not allow a smooth integration between monadic composition and applicative composition

Applicative laws

This version reverses the order of the apply operator arguments while still keeping the left to right ordering of the arguments supplied to ap:

// Identity:
lhs_m.ap(f).apply(|x| x) === f(rhs_m)
// Composition:
lhs_m.ap(w).apply(|i| i.ap(u).ap(v).apply(compose)) === rhs_m.ap(w).apply(v).apply(u)
// Homomorphism:
i.ap(|i| i.ret(x)).apply(f) === rhs_m.ret(f(x))
// Interchange:
i.ap(|i| i.ret(y)).apply(u) === i.ap(|i| i.ret(u)).apply(apply_arg(y))

Note that it does not play well with existing values in the Monad/Applicative since they are of type T which is not a tuple while the Applicative functions (ap and apply) only work on tuples in the applicative. IE. Input + ParseResult<T, E> is always a Monad, but only an Applicative if T is a tuple of some kind.