m4rw3r/chomp

Applicative parsing

m4rw3r opened this issue · 0 comments

Problem

Some somewhat common constructions using monadic composition are more complex than they have to be. Applicative solves that by being less powerful and lacking context sensitivity (commonly):

do
    a <- digitP
    b <- digitP
    c <- digitP
    d <- digit
    return $ IP a b c d
  where digitP = digit >>= \x -> skip '.' >>= \_ -> return x

-- vs

pure IP <$> digitP <*> digitP <*> digitP <*> digitP <*> digit
  where digitP = digit <* (skip '.')

The default implementation of Applicative for a Monad also works for Chomp:

instance Applicative X where
    pure    = return
    d <*> e = do
      b <- d
      a <- e
      return (b a)
    (*>)    = (>>)
    x <* y  = x >>= \a -> y >> return a

But the issue here is that to be able to write things like

i.ret(IP).apply(digitP).apply(digitP).apply(digitP).apply(digit)

IP must be a function which is partially applied and is invoked like IP(192)(168)(0)(23) which is both very strange to use in Rust (which does not support partial application natively) and very slow since it forces boxed closures to be used to store the intermediate state.

Implementation

  • pure

    Same implementation as Input::ret, use Input::ret instead of pure.

  • <*>

    Problematic due to evaluation order and no partial application, see [Implementation] section.

  • *>

    Does not need any implementation, just use then.

  • <*

    Name skip, simple and no special considerations: self.bind(|i, t| rhs(i).map(|_| t))

Input::ret is kept as is to provide a way to lift values into the Applicative. then is also left as is as a method on ParseResult. skip is added to impl ParseResult since it is very useful even in monadic chaining.

apply is either implemented on a trait in a separate module or directly on the ParseResult.

Variants

There are multiple ways of implementing apply

Following Haskell's default Applicative

i.ret(capitalize).apply(any)
impl ParseResult<T, E> {
    pub fn apply<F, U, V, W>(self, F) -> ParseResult<W, V>
      where T: FnOnce(U) -> W,
            F: FnOnce(Input) -> ParseResult<U, V>,
            V: From<E>;
}

Pros

  • Simple
  • Follows a direct translation of Applicative laws
  • Does not consider values and functions differently

Cons

  • Only allows for 1-arity functions
  • Requires partial application to compose well
  • Requires utility functions or closures adapting the results if functions of n-arity are to be used
  • Unstable features are used if boxing of functions are needed

Monad laws

// Identity:
lhs_m.ret(|x| x).apply(f) === f(rhs_m)
// Composition:
lhs_m.ret(compose).apply(u).apply(v).apply(w) === u(rhs_m).apply(|i| v(i).apply(w))
// Homomorphism:
lhs_m.ret(f).apply(|i| i.ret(x)) === rhs_m.ret(f(x))
// Interchange:
u(lhs_m).apply(|i| i.ret(y)) === rhs_m.ret(apply_arg(y)).apply(u)

Tuple as an argument

fn digitP(i: Input<u8>) -> U8Result<u8> { digit(i).skip(|i| token(i, b'.')) }
i.ret(IP).apply((digitP, digitP, digitP, digit))
trait ApplyArgs<F> {
    type Output;
    type Error;
    fn apply(self, Input, F) -> ParseResult<Self::Output, Self::Error>;
}

impl<A, AT, AE, T, F> ApplyArgs<F> for A
  where F: FnOnce(AT) -> T,
          A: FnOnce(Input) -> ParseResult<AT, AE> {
    type Output = T;
    type Error = AE;

    fn apply(self, i: Input, f: F) -> ParseResult<Self::Output, Self::Error> {
        self(i).map(f)
    }
}

// Impls for tuples, eg. (A, B) where F: FnOnce(A, B) -> Out, A: FnOnce(Input) -> ParseResult<A, AE>, ...

impl ParseResult<T, E> {
    pub fn apply<A>(self, rhs: A) -> ParseResult<A::Output, A::Error>
      where A: ApplyArgs<T>,
             A::Error: From<E> {
        self.bind(|i, f| rhs.apply(i, f))
    }
}

Pros

  • Allows for arbitrary number of parameters
  • Compact syntax in usage
  • Does not treat functions and values differently
  • Applicative laws still apply if translated to tuples for multiple arguments

Cons

  • Fails to properly infer for closures in tuples like (|i| i.ret("foobar"), ...) (It cannot determine that i is Input) since it is using several traits, struct ParseResult -> trait ApplyArgs -> trait FnOnce -> concrete closure
  • Not very ergonomic at times since types have to be declared for all parameters.
  • Not possible to stack any values conditionally, the specific expression has to be known at compile time
  • Not as composable as the default since the tuples do not lend themselves well to partial application (which is kind of necessary to be able to express Applicative laws in a simple manner).
  • Horrible type impls for the tuples

Applicative laws

// Identity:
lhs_m.ret(|x| x).apply(f) === f(rhs_m)
// Composition:
lhs_m.ret(compose).apply((u, v)).apply(w) === u(rhs_m).apply(|i| v(i).apply(w))
// Homomorphism:
lhs_m.ret(f).apply(|i| i.ret(x)) === rhs_m.ret(f(x))
// Interchange:
u(lhs_m).apply(|i| i.ret(y)) === rhs_m.ret(apply_arg(y)).apply(u)

Stacking tuples

The idea here is to stack values into a tuple and then invoke a function with the given values:

i.ap(digitP).ap(digitP).ap(digitP).ap(digit).apply(IP)

Pros

  • Allows for somewhat conditional stacking of values
  • Follows Applicative laws if rewritten in Reverse Polish Notation with the stack-operator ´ap`

Cons

  • Treats values and functions differently
  • Treats tuples and non-tuples differently (cannot make a blanket impl for any T since it will conflict with all the impls for tuples).
  • Does not allow a smooth integration between monadic composition and applicative composition

Applicative laws

This version reverses the order of the apply operator arguments while still keeping the left to right ordering of the arguments supplied to ap:

// Identity:
lhs_m.ap(f).apply(|x| x) === f(rhs_m)
// Composition:
lhs_m.ap(w).apply(|i| i.ap(u).ap(v).apply(compose)) === rhs_m.ap(w).apply(v).apply(u)
// Homomorphism:
i.ap(|i| i.ret(x)).apply(f) === rhs_m.ret(f(x))
// Interchange:
i.ap(|i| i.ret(y)).apply(u) === i.ap(|i| i.ret(u)).apply(apply_arg(y))

Note that it does not play well with existing values in the Monad/Applicative since they are of type T which is not a tuple while the Applicative functions (ap and apply) only work on tuples in the applicative. IE. Input + ParseResult<T, E> is always a Monad, but only an Applicative if T is a tuple of some kind.