tweag/inline-java

Variadic safe API

Closed this issue · 30 comments

mboes commented

#133 and #135 both make call variadic, in the sense that the concrete type of call can change depending on the number of arguments provided. #135 only covers the unsafe API. Using the same approach in the safe API requires generalizing it to variadic functions in an abstract monad.

mboes commented

In #135 (comment), @facundominguez points out that GHC doesn't simplify a CallResult (m ()) constraint when m is kept polymorphic. Indeed, it would be unsound to do so, because (->) could be a concrete instantiation for m. This puts a dent in the type class based approach, because users will have to carry around "spurious" CallResult constraints in any code where the base monad is not concrete. Cleanly avoiding manual evidence (à la With*Args originally proposed) is pretty hard indeed.

cc @goldfirere

mboes commented

Though the advent of quantified constraints could help us solve this by hiding the constraint in another which we would have to carry anyways.

class (MonadIO m, forall b. CallResult (m b)) => MonadJava m

This works. It uses a sentinel value at the end of the variadic list of arguments. I guess it counts as supplying manual evidence, though it doesn't require counting the arguments of a function as WithNArgs.

{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}

import Control.Monad.IO.Class
import qualified Data.Coerce as Coerce

class Coercible a
instance Coercible ()
instance Coercible Int
instance Coercible Bool
instance Coercible Char

data Ty a
type family J a
type instance J (Ty a) = a
class IsReferenceType a
instance IsReferenceType (Ty Ref)

data S = S

class CallResult r where
instance Coercible b => CallResult (S -> m b) where
instance {-# OVERLAPPABLE #-} (Coercible a, CallResult r) => CallResult (a -> r) where


call :: (IsReferenceType (Ty a), Coerce.Coercible a (J (Ty a)), Coercible a, CallResult r)
     => a -> String -> r
call = undefined

data Ref = Ref
instance Coercible Ref

newRef :: IO Ref
newRef = undefined

f :: MonadIO m => m ()
f = call Ref "binary" 'x' (5 :: Int) S

test :: IO ()
test = do
  ref <- newRef
  () <- call ref "nullary" S
  _ :: Int <- call ref "nullary returning int" S
  () <- call ref "unary" True S
  () <- call ref "binary" 'x' (5 :: Int) S
  return ()

Introducing a MonadJava constraint is rather heavy-weight, if only to enable the variadic functions. If all our interface needs is MonadIO m, it will be easier to use if we can keep it simple as that.

The other reason to require MonadJava would be to ensure that uses of Java are executed in a context where references are deleted when an exception occurs. But I'm uncertain of what interface we would have to use it.

  runMonadJava :: MonadJava m => m a -> ? a

Using a monad transformer only for this sake looks heavy-weight again :)

As the interface stands today, the user is responsible for surrounding the monad computations with a couple of calls in IO (pushLocalFrame and popLocalFrame).

mboes commented

Introducing a MonadJava constraint is rather heavy-weight, if only to enable the variadic functions. If all our interface needs is MonadIO m, it will be easier to use if we can keep it simple as that.

I disagree. The ReaderT pattern, which requires a constraint of this kind, is very well-established by now, with much documentation on the topic and even widely adopted libraries meant specifically to espouse this style. Rio is one example, cited in a recent blog post. Our own capability is another one. There is nothing special about MonadIO. It's a capability like any other. Requiring a MonadJava capability to be able to make JNI calls sounds pretty reasonable. And hardly surprising to users given how frequent the ReaderT pattern is. At any rate less surprising than an extraneous sentinel value that is easy to forget and annoying to explain. Having to pass in an 0-ary extraneous constructor to make a call is what we had in the first version of #135.

This also seems to work:

class CallResult r where
instance {-# INCOHERENT #-} (Coercible b, MonadIO m) => CallResult (m b) where
instance {-# INCOHERENT #-} (Coercible a, CallResult r) => CallResult (a -> r) where
instance {-# OVERLAPPABLE #-} TypeError (Text "Expected ... ") => CallResult x where

f :: MonadIO m => m ()
f = call Ref "binary" 'x' (5 :: Int)

-- fails with: Could not deduce (MonadIO m) arising from a use of ‘call’
f2 :: Monad m => m ()
f2 = call Ref "binary" 'x' (5 :: Int)

-- fails with: Could not deduce instance (MonadIO Maybe) arising from a use of ‘call’
g :: Maybe ()
g = call Ref "binary" 'x' (5 :: Int)

-- fails with: Expected ...
h :: ()
h = call Ref "binary" 'x' (5 :: Int)

-- fails with: Expected ...
h :: a
h = call Ref "binary" 'x' (5 :: Int)

The algorithm for resolving overlapping instances is deterministic in this setup, if my analysis is correct.

I think that the approach in #137 (comment) will sometimes go wrong: it will work only when GHC (for whatever reason) knows right at the beginning how many arguments you pass to call. For example, what about this:

let partial = call ref "foo" x y
res1 <- partial "z1"
res2 <- partial "z2"

Looks reasonable -- but I think it will witness your incoherence.

I had to make CallResult (m b) overlappable as well, but otherwise your example works fine.
I made the code available here in case you want to crack it.

https://gist.github.com/facundominguez/2116a7eee0b0fa5eae37e9d58e4340cf

Perhaps one feature that doesn't allow incoherency to misbehave, is that CallResult (m b) requires MonadIO m or GHC will reject the program. I would expect a problematic incoherency to have a program accepted with the wrong instance used.

Thanks for posting that. Trouble lurks if we try to take partial out from the local let. And the trouble suggests enabling -XIncoherentInstances, which is clearly a bug, because it's already enabled. I will post a GHC bug.

Posted https://gitlab.haskell.org/ghc/ghc/issues/17513

I'm not sure why a local let behaves differently from a global one. But still, I think the global incoherent instance will cause trouble. I wouldn't do it this way.

Thx for taking a look. I'm afraid I still don't get your point. What is a global let?
Speculating that it could mean a top level definition I tried

partial = call 'x' (5 :: Int)

g :: forall m. MonadIO m => m ()
g = do
  res1 <- partial "z1"
  partial "z2"

But it still gives reasonable type errors, and they are fixed by adding a type signature for partial.

Otherwise, any source I could check for the trouble that incoherent instances can cause here, if only for my own illumination?

I'm afraid I don't have a source exactly, but {-# INCOHERENT #-} means that GHC might commit to an instance even before it's sure that the instance is the best choice. In our case, there is an erroring instance CallResult x. This means it will match against any type. Of course, if GHC can find a better instance, it will choose that better instance. But if GHC knows nothing in particular about the type x for which it is trying to solve CallResult x, it will commit to the erroring instance -- even if, after a little more work, it might choose one of the others.

Another way to think of it is this: GHC can -- at any moment, with no reason whatsoever, and unpredictably between minor releases -- decide to use the erroring instance. The fact that it doesn't choose that instance all the time is just a fluke. On a given version of GHC, the fluke will be deterministic, but it might not continue to work in the future.

There are cases where {-# INCOHERENT #-} can be the right way forward. But, here, we have a better, more predictable way to structure this all, and so there doesn't seem to be a need for this unpredictable feature.

With ghc-8.6.5, this doesn't seem to work:

class (MonadIO m, forall a. CallResult_ () (m a)) => MonadJava m where

f :: MonadJava m => m ()
f = call Ref "binary" 'x' (5 :: Int)

This is the error I get:

test2.hs:48:5: error:
    • Could not deduce (CallResult_ (NumArgs (m ())) (m ()))
        arising from a use of ‘call’
      from the context: MonadJava m
        bound by the type signature for:
                   f :: forall (m :: * -> *). MonadJava m => m ()
        at test2.hs:47:1-24
    • In the expression: call Ref "binary" 'x' (5 :: Int)
      In an equation for ‘f’: f = call Ref "binary" 'x' (5 :: Int)
   |
48 | f = call Ref "binary" 'x' (5 :: Int)
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Using

class (MonadIO m, forall a. NumArgs (m a) ~ Zero) => MonadJava m where

doesn't make a difference either.

And this works:

f :: (MonadIO m, Zero ~ NumArgs (m ())) => m ()
f = call Ref "binary" 'x' (5 :: Int)

It turns out that quantified constraints don't work well with a type family mentioned in the conclusion. HEAD reports this much more nicely than GHC 8.6, which silently ignored the quantified constraint.

Happily, we can work around this, with a bit more horsepower:

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE QuantifiedConstraints #-}
{-# LANGUAGE UndecidableInstances #-}

module Variadic where

import Control.Monad.IO.Class
import qualified Data.Coerce as Coerce

class Coercible a
instance Coercible ()
instance Coercible Int
instance Coercible Bool
instance Coercible Char

data Ty a
type family J a
type instance J (Ty a) = a
class IsReferenceType a
instance IsReferenceType (Ty Ref)

data Nat = Zero | Succ Nat

type family NumArgs a where
  NumArgs (m b) = NumArgs' (IsArrowAppliedToOneArg m) b
  NumArgs _atom = Zero

type family NumArgs' is_arrow b where
  NumArgs' True b  = Succ (NumArgs b)
  NumArgs' False _ = Zero

type family IsArrowAppliedToOneArg m where
  IsArrowAppliedToOneArg ((->) _) = True
  IsArrowAppliedToOneArg _        = False

type CallResult r = CallResult_ (NumArgs r) r

class CallResult_ (num :: Nat) r
instance (Coercible b, MonadIO m) => CallResult_ Zero (m b)
instance (Coercible a, CallResult_ n r) => CallResult_ (Succ n) (a -> r)

call :: (IsReferenceType (Ty a), Coerce.Coercible a (J (Ty a)), Coercible a, CallResult r)
     => a -> String -> r
call = undefined

class (MonadIO m, IsArrowAppliedToOneArg m ~ False) => MonadJava m

data Ref = Ref
instance Coercible Ref

newRef :: IO Ref
newRef = undefined

f :: MonadJava m => m ()
f = call Ref "binary" 'x' (5 :: Int)

test :: IO ()
test = do
  ref <- newRef
  () <- call ref "nullary"
  _ :: Int <- call ref "nullary returning int"
  () <- call ref "unary" True
  () <- call ref "binary" 'x' (5 :: Int)
  return ()

It's not quite a thing of beauty, but it should work. But if users use e.g. MonadIO in their abstraction instead of MonadJava, they'll get an ugly type error.

Next up: trying to fix this using https://kcsongor.github.io/report-stuck-families/

Success.

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE QuantifiedConstraints #-}
{-# LANGUAGE UndecidableInstances #-}

module Variadic where

import Control.Monad.IO.Class
import qualified Data.Coerce as Coerce
import GHC.TypeLits ( TypeError, ErrorMessage(..) )
import Data.Kind

class Coercible a
instance Coercible ()
instance Coercible Int
instance Coercible Bool
instance Coercible Char

data Ty a
type family J a
type instance J (Ty a) = a
class IsReferenceType a
instance IsReferenceType (Ty Ref)

data Nat = Zero | Succ Nat

type family NumArgs a where
  NumArgs (m b) = NumArgs' (IsArrowAppliedToOneArg m) b
  NumArgs _atom = Zero

type family NumArgs' is_arrow b where
  NumArgs' True b  = Succ (NumArgs b)
  NumArgs' False _ = Zero

type family IsArrowAppliedToOneArg m where
  IsArrowAppliedToOneArg ((->) _) = True
  IsArrowAppliedToOneArg _        = False

type family CanReduce n where
  CanReduce n = CanReduce' (TypeError (Text "Cannot be sure that your monad is not (->). Perhaps use a MonadJava constraint?")) n

type family CanReduce' err n :: Constraint where
  CanReduce' _   Zero     = ()
  CanReduce' err (Succ n) = CanReduce' err n

type CallResult r = (CanReduce (NumArgs r), CallResult_ (NumArgs r) r)

class CallResult_ (num :: Nat) r
instance (Coercible b, MonadIO m) => CallResult_ Zero (m b)
instance (Coercible a, CallResult_ n r) => CallResult_ (Succ n) (a -> r)

call :: (IsReferenceType (Ty a), Coerce.Coercible a (J (Ty a)), Coercible a, CallResult r)
     => a -> String -> r
call = undefined

class (MonadIO m, IsArrowAppliedToOneArg m ~ False) => MonadJava m

data Ref = Ref
instance Coercible Ref

newRef :: IO Ref
newRef = undefined

f :: MonadIO m => m ()
f = call Ref "binary" 'x' (5 :: Int)

g :: MonadJava m => m ()
g = call Ref "binary" 'x' (5 :: Int)

test :: IO ()
test = do
  ref <- newRef
  () <- call ref "nullary"
  _ :: Int <- call ref "nullary returning int"
  () <- call ref "unary" True
  () <- call ref "binary" 'x' (5 :: Int)
  return ()

On GHC 8.6 and HEAD, the program above gives me

Variadic.hs:71:5: error:
    • Cannot be sure that your monad is not (->). Perhaps use a MonadJava constraint?
    • In the expression: call Ref "binary" 'x' (5 :: Int)
      In an equation for ‘f’: f = call Ref "binary" 'x' (5 :: Int)

If I comment out f, g works well.

I can't guarantee that the implementation details don't leak out, but this just might work.

A valid question at this point is whether it's worth it to use this much power here. I think it probably is, but it might be worth thinking about for a few minutes.

Separately, these recent formulations don't seem to need a fixed number of CallResult_ instances, meaning there won't be an upper limit to the number of args you can take, and we can do without the Template Haskell stuff that generates all the instances. (This improvement is orthogonal to the others. Not sure why I didn't do it this way from the beginning.)

Separately, these recent formulations don't seem to need a fixed number of CallResult_ instances,

Right, this is what we finally committed for the unsafe interface in #135 .

I think I wouldn't mind the complexity here, if we can move to the solution with quantified constraints eventually. The mechanism that makes CanReduce' work looks like something prone to change across GHC versions.

mboes commented

@goldfirere given what you say above about the interaction between quantified constraints and type families, would fundeps work better?

In the last couple of proposals, the following leaks CallResult in the error message:

f1 :: ()
f1 = call Ref "binary" 'x' (5 :: Int)

Error:

test4.hs:74:6: error:
    • No instance for (CallResult_ 'Zero ())
        arising from a use of ‘call’
    • In the expression: call Ref "binary" 'x' (5 :: Int)
      In an equation for ‘f1’: f1 = call Ref "binary" 'x' (5 :: Int)
   |
74 | f1 = call Ref "binary" 'x' (5 :: Int)
   |      ^^^^^^^^^^^^^^^^^^^^^^^^^

HEAD reports this much more nicely than GHC 8.6

The error in #137 (comment) is the same with GHC 8.10.0.20191121.

@mboes While I haven't worked this out with fundeps, I don't think the fundamental issue around quantified constraints will change. Allowing a type family in the head of a quantified constraint risks making a non-confluent inference algorithm. Even if the whole thing were phrased with fundeps, I think the same underlying issue would present.

As for #137 (comment): This, too, can be solved. But my computer is in the shop and so I can't quite work it out at the moment. The way forward will look something like this:

instance (ConstrainApplication MonadIO Coercible r) => CallResult_ Zero r
type family ConstrainApplication (c_fun :: (Type -> Type) -> Constraint) (c_arg :: Type -> Constraint) r :: Constraint where
  ConstrainApplication c_fun c_arg (m a) = (c_fun m, c_arg a)
  ConstrainApplication _ _ other = TypeError ...

As for stability of the CanReduce trick: I agree that it's worth thinking about, but I'm not very concerned here. I cannot imagine another way for this to work. Well, I can't imagine another way for the type family reductions to work. Maybe the TypeError facility might evolve? Not sure. But we can tackle that when we get to it. Is there a facility in the test suite to check the errors that arise from erroneous client code?

ConstrainApplication works indeed.
After reading csongor post, I was expecting the reduction detection to depend on the strategy used by ghc to evaluate type families, but CanReduce does seem to have only one possible reduction path.

ConstrainApplication works indeed.

Or we could just use

instance (MonadIO m, r ~ m a) => CallResult_ Zero r

which produces

test4.hs:85:6: error:
    • Couldn't match type ‘()’ with ‘m0 a0’
        arising from a use of ‘call’
    • In the expression: call Ref "binary" 'x' (5 :: Int)
      In an equation for ‘f1’: f1 = call Ref "binary" 'x' (5 :: Int)
   |
85 | f1 = call Ref "binary" 'x' (5 :: Int)
   |      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:)

ConstraintApplication can yield a better error message, I admit.

Is there a facility in the test suite to check the errors that arise from erroneous client code?

There is not, so far.

#139 materializes all of this hackery in the safe interface.

Or we could just use

Indeed. Sometimes I use more hammers than necessary -- thanks for finding this easier solution.

Introducing a MonadJava constraint is rather heavy-weight,

There is one argument to be made in favor a simple approach that requires some more artifacts (like a sentinel to be used). And that is that inline-java is almost always to be preferred to the interface in Language.Java.Safe.

The quasiquoter is going to generate the code making the invocation of callStatic, and therefore, it doesn't matter if you have to use a sentinel at the end. The user is not going to write that code himself, so any additional effort to embellish it would hardly pay off.

This also highlights that users of inline-java don't need to care of MonadJava, it is only users of Language.Java.Safe which do.

I'm not really against the sentinel approach (though I would probably rename it to End instead of S). It's certainly much simpler. I think what we've done here is outlined several approaches, each with benefits and drawbacks. Now we choose one.

Especially if we're going to be generating this code, that's an argument in favor of the sentinel approach, which is simpler (and thus less likely to have strange interactions).

The sentinel approach is implemented in #140.

@mboes, I would be merging that unless you want to veer direction.