snoyberg/conduit

RFC: Stop using the type synonyms in library type signatures

snoyberg opened this issue · 22 comments

Reasons:

  • Adds extra things for people to learn
  • Difference between Source and Producer (or Sink and Consumer) is subtle and tricky
  • Relying on a poorly supported language extension

Downsides:

  • Makes the type signatures more confusing to read (arguable)
  • Possibly breaks lots of old docs

Please use 👍 and 👎 reactions on this description to express support/opposition to this change.

Assuming the change happens I'd like to see commentary in the code for folks learning the library like:

-- equivalent type signature: func :: a -> b -> Source m o
func :: a -> b -> ConduitM () o m ()
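To make the suggested commentary concrete, here is a minimal sketch showing that a synonym signature and its expanded form name the same type. It does not use the conduit package: `Toy`, its constructors, `countdown`, `countdownSyn` and `outputs` are hypothetical names invented for this illustration, and `Toy` is only a simplified stand-in for `ConduitM`.

```haskell
import Data.Functor.Identity (runIdentity)

-- Toy stand-in for ConduitM (an assumption for illustration, not conduit's real type).
data Toy i o m r
  = Yield o (Toy i o m r)          -- emit one output, then continue
  | Await (Maybe i -> Toy i o m r) -- request input; Nothing means none is available
  | Lift (m (Toy i o m r))         -- run an effect in the base monad
  | Done r                         -- finish with a result

-- Toy analogue of the Source synonym.
type Source m o = Toy () o m ()

-- equivalent type signature: countdown :: Int -> Source m Int
countdown :: Int -> Toy () Int m ()
countdown n
  | n <= 0    = Done ()
  | otherwise = Yield n (countdown (n - 1))

-- The synonym names the same type, so no conversion is needed.
countdownSyn :: Int -> Source m Int
countdownSyn = countdown

-- Collect every output, answering each Await with Nothing.
outputs :: Monad m => Toy i o m r -> m [o]
outputs (Yield o next) = fmap (o :) (outputs next)
outputs (Await k)      = outputs (k Nothing)
outputs (Lift act)     = act >>= outputs
outputs (Done _)       = return []

main :: IO ()
main = do
  print (runIdentity (outputs (countdown 3)))
  print (runIdentity (outputs (countdownSyn 3)))
```

Both calls print the same list, since the two signatures differ only in spelling.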

I found the type synonyms confusing when I first got to reading the code. But I find the synonyms provide a helpful shorthand in my own code now that I understand what's going on.

After reading through the new tutorial my impression is that the type synonyms do not help at all. They might lead me to think that a Source is something other than a Conduit, and not just a special case.

Would it be feasible to drop the M from ConduitM, also?

I've thought about that, but that would be some major breakage. It's
certainly tempting though.


Is type Conduit = ConduitM enough to start with? The M is certainly pretty ugly.

The whole Void vs () is the nasty piece. I thought at one point the type synonyms helped to paper that over, but they don't seem to anymore, so their value is much less.

I'm sure this has already occurred to you, and been rejected for being silly in some way, but is rank2 polymorphism enough?

runConduit :: Monad m => (forall i o . ConduitM i o m r) -> m r

Might that encode that you don't look at the values of the input or output? (I assume this idea is silly, I certainly haven't sanity checked it, but it occurs to me so seems worth writing at least.)

type Conduit = ConduitM has almost as much breakage. It breaks anyone's code that is using the current Conduit type synonym. I'd say if we're going to pull off the bandaid, let's just pull it off entirely.

This is the first I'm hearing of the Void bit being the nasty piece. I'm actually quite open to something that I think is simpler than what you're saying: just using () as the output. It means that a pipeline is capable of yielding () values... but who really cares?

The rank2 polymorphism approach is something I'd like to avoid, this library has already suffered a lot from impredicative types popping up everywhere.

Ripping bandaids is always painful in the short term - not sure if it's a good idea or not.

It's the void/() asymmetry that is the problem. Some places are positive, some are negative, so I appreciate the duality. But juggling co/contra-variant in my head always makes me want a drink, so I imagine beginners don't find it any easier.

Point taken on impredicativity - it sounds like you're saying "plausible idea, but the tools will beat you to death for it". That makes sense. How about:

runConduit :: Monad m => ConduitM () o m r -> m r

I'm going to feed you inputs, and ignore your outputs entirely. The fact you've statically proved there are no outputs doesn't really matter. I think of it as somewhat equivalent to whether >> should take m a or m () as its first argument.

Also note that if you could avoid Void entirely you could simplify the explanation to users. Void is a bit funky and confusing to everyone who isn't a Haskell programmer.
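A minimal sketch of the "feed inputs, ignore outputs" reading of the `ConduitM () o m r` signature, under stated assumptions: `Toy` is a hypothetical simplified model rather than conduit's real type, and `runDraining` and `countTwo` are names invented here.

```haskell
import Data.Functor.Identity (runIdentity)

-- Toy stand-in for ConduitM (an assumption for illustration, not conduit's real type).
data Toy i o m r
  = Yield o (Toy i o m r)          -- emit one output, then continue
  | Await (Maybe i -> Toy i o m r) -- request input; Nothing means none is available
  | Lift (m (Toy i o m r))         -- run an effect in the base monad
  | Done r                         -- finish with a result

-- Hypothetical runner: every Await is answered with Just (),
-- and every output is dropped without being inspected.
runDraining :: Monad m => Toy () o m r -> m r
runDraining (Yield _ next) = runDraining next
runDraining (Await k)      = runDraining (k (Just ()))
runDraining (Lift act)     = act >>= runDraining
runDraining (Done r)       = return r

-- Awaits twice, yields once in between, and returns how many inputs arrived.
countTwo :: Toy () String m Int
countTwo =
  Await $ \a ->
    Yield "dropped by the runner" $
      Await $ \b ->
        Done (length (filter (== Just ()) [a, b]))

main :: IO ()
main = print (runIdentity (runDraining countTwo))
```

One consequence of this particular sketch: a pipe that keeps awaiting until it sees Nothing would never finish under `runDraining`, because the runner never reports the end of input.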

If you're going to go that far, we can actually just go for:

runConduit :: Monad m => ConduitM i o m r -> m r

We can promise to deliver a stream of any value, and then simply never produce them. The only question is: does this make errors from users more or less likely? I've so far gone under the assumption that enforcing the invariants is a good thing, but I'm open to the idea of just relaxing them. That could even be done with a minor version bump AFAICT.
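As a rough illustration of the fully relaxed signature and of the question it raises about user errors, here is a sketch in a simplified model. `Toy`, `runAny`, `lonelySource` and `lonelySink` are hypothetical names; none of this is conduit's actual implementation.

```haskell
import Data.Functor.Identity (runIdentity)

-- Toy stand-in for ConduitM (an assumption for illustration, not conduit's real type).
data Toy i o m r
  = Yield o (Toy i o m r)          -- emit one output, then continue
  | Await (Maybe i -> Toy i o m r) -- request input; Nothing means none is available
  | Lift (m (Toy i o m r))         -- run an effect in the base monad
  | Done r                         -- finish with a result

-- Hypothetical relaxed runner: accepts any input and output types,
-- never supplies an input, and discards every output.
runAny :: Monad m => Toy i o m r -> m r
runAny (Yield _ next) = runAny next
runAny (Await k)      = runAny (k Nothing)
runAny (Lift act)     = act >>= runAny
runAny (Done r)       = return r

-- A source with nothing downstream: its three outputs are silently lost.
lonelySource :: Toy i Int m ()
lonelySource = Yield 1 (Yield 2 (Yield 3 (Done ())))

-- A sink with nothing upstream: it sums an empty stream.
lonelySink :: Toy Int o m Int
lonelySink = go 0
  where
    go acc = Await (maybe (Done acc) (\x -> go (acc + x)))

main :: IO ()
main = do
  print (runIdentity (runAny lonelySource))
  print (runIdentity (runAny lonelySink))
```

Both calls type-check and run in this model, which is the trade-off in question: the relaxed signature is simpler to explain, but a half-built pipeline is no longer rejected by the compiler.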

The ripping bandaids part could be done in parts, such as by marking the type synonyms as deprecated now, and then having a conduit 1.3 release in a few months that s/ConduitM/Conduit. Another possible name to throw into the mix that doesn't break anything is ConduitT, which may make sense because it's really a monad transformer.

Possibilities to consider in all of this are also deprecating the other operators. All options are on the table here.

I was expecting runConduit would feed in input values, so would actually supply as many () as the thing wanted. I'm also expecting it to pull output values as long as it needs to, just ignore them. If you have a conduit which takes input, but never gets it, aren't you just going to get stuck and raise a runtime exception? Enforcing invariants to avoid errors is definitely good.

For operator deprecation, I tend to prefer stages - first make them direct synonyms at the source and say "use this instead". Then wait and deprecate them. Then remove them. No rush, but they should be clearly signposted as "not for future use".

If you have a conduit which takes input, but never gets it, aren't you just going to get stuck and raise a runtime exception?

No. The conduit await function returns a Maybe value, so if no input is available, you get Nothing. It's certainly possible for someone to do something nonsensical like:

await >>= maybe loop return

But I'm not too terribly worried about that. It's really about preventing someone from doing something like:

runConduit $ sourceFile "foo"
-- or
runConduit $ sinkFile "foo"

The way runConduit works today is that it does not provide an infinite stream of values, but instead immediately says "no values available." For draining: it can leverage the Void constraint and use absurd to ensure nothing is ever yielded, but switching to draining all output is possible.
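The behaviour described here can be sketched in a simplified model. This is an illustration under assumptions, not conduit's source: `Toy`, `runClosed` and `peek` are hypothetical names, while `Void` and `absurd` come from `Data.Void` in base.

```haskell
import Data.Functor.Identity (runIdentity)
import Data.Void (Void, absurd)

-- Toy stand-in for ConduitM (an assumption for illustration, not conduit's real type).
data Toy i o m r
  = Yield o (Toy i o m r)          -- emit one output, then continue
  | Await (Maybe i -> Toy i o m r) -- request input; Nothing means none is available
  | Lift (m (Toy i o m r))         -- run an effect in the base monad
  | Done r                         -- finish with a result

-- Hypothetical runner with the strict shape: input (), output Void.
runClosed :: Monad m => Toy () Void m r -> m r
runClosed (Yield o _) = absurd o              -- unreachable: no Void value exists
runClosed (Await k)   = runClosed (k Nothing) -- "no values available"
runClosed (Lift act)  = act >>= runClosed
runClosed (Done r)    = return r

-- Reports what a single await observes.
peek :: Toy () Void m String
peek = Await $ \mi ->
  Done (maybe "no values available" (const "got a value") mi)

-- Rejected by the type checker, because Int is not Void:
-- bad = runClosed (Yield (1 :: Int) (Done ()))

main :: IO ()
main = putStrLn (runIdentity (runClosed peek))
```

Switching to "drain all output" would amount to replacing the `absurd o` case with one that skips the output and continues, at the cost of accepting pipelines whose output type is not `Void`.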

I agree on slow and steady on the operators.

I note the first of those will almost certainly raise an error when they try and use the result from runConduit, so it's not as big a deal. Reducing the probability of the second by using () as the input does seem sensible, which leads back to:

runConduit :: Monad m => ConduitM () o m r -> m r

I appreciate the guarantees Void provides, but I'm skeptical they are worth the cost given the beginner-unfriendliness of Void.

Is Void really that bad? Seems simple to me. Or is it just the positive/negative alternation that's hard?

Void is not trivial. Positive/negative is genuinely hard. I observe that the current tutorial says:

The choice of () and Void instead of, say, both () or both Void, is complicated. For now, I recommend just accepting that this makes sense

Or "magic lies here". Better to remove the magic than upset the muggles.

By that argument we wouldn't have a Monad type class. :)

I like Void, and know very little about conduit, but have been messing around with Haskell for a long time.


What is the reasoning for using () in the input? I mean () is a content-less signal, and Void is /actually no information/, and I think that distinction, while made blurry by the way the compiler handles case statements, is still useful.

By that argument we wouldn't have a Monad type class. :)

Ah no, different concepts carry different amounts of water and have different degrees of difficulty associated with them. Conflation doesn't make your point stronger.

IME, the problems with Conduit for myself and for beginners were type tetris / thrashing, not Void. YMMV.

Yes, monads are a lot harder than Void. Though more clearly worth it because they are also useful for a lot more things.

Though Haskell being lazy and having bottom makes the distinction between Void and () somewhat unreal.

In the context of conduit is there any benefit (beyond a
mental/semantic one) to Void as opposed to the venerable () unit type?

It's just a stronger statement of "this thing doesn't yield anything." There's certainly an argument to be made that discarding a stream of () values isn't any worse than getting to call absurd on a Void value.

All of this together is certainly pushing me towards a conduit-1.3 (or 2.0?) breaking change release. I'll probably start working properly on a branch soon, and would love feedback when I get deeper into it.

I agree that a 2.0 release would be wise given the number of breaking
changes. The conduit house will be in better order moving forward.

@snoyberg One question is whether we believe beginners would think of the use-case of a source of () for counting/trigger purposes. That could lead to confusion if the real intent was, "pretend we're sending nothing but we actually could". That also means mistakes could be made if someone was in fact using () for this purpose.

Again, I haven't seen Void confuse anyone as it concerns Conduit. It was thrashing with types and operators that was holding up implementation. I could do another hands-on with a beginner if you want more data.

I started using conduit at the beginning of this year. The type synonyms didn't confuse me until I started reading about them.

From README.md:

A Producer is something that produces values, and may or may not consume

type Producer   m o   = forall i. ConduitM i  o    m ()

Makes sense. I can still await values, I just can't do anything with them because I know nothing about their type.

A Source produces values but does not consume any

type Source     m o   =           ConduitM () o    m ()

This doesn't make sense to me. I can still consume values with await and the type doesn't guarantee that I'll get Nothing, I could still get Just ().

It seems to me that the typedef should be:

type Source m o = ConduitM Void o m ()

Then, if I await, I'm guaranteed Nothing.
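The difference between the two input types can be sketched in a simplified model. `Toy`, `feed`, `unitSource` and `voidSource` are hypothetical names used only for this illustration; `Void` and `absurd` are from `Data.Void` in base.

```haskell
import Data.Functor.Identity (runIdentity)
import Data.Void (Void, absurd)

-- Toy stand-in for ConduitM (an assumption for illustration, not conduit's real type).
data Toy i o m r
  = Yield o (Toy i o m r)          -- emit one output, then continue
  | Await (Maybe i -> Toy i o m r) -- request input; Nothing means none is available
  | Lift (m (Toy i o m r))         -- run an effect in the base monad
  | Done r                         -- finish with a result

-- Feed a list of inputs (then Nothing forever) and collect the outputs.
feed :: Monad m => [i] -> Toy i o m r -> m [o]
feed ins      (Yield o next) = fmap (o :) (feed ins next)
feed (i : is) (Await k)      = feed is (k (Just i))
feed []       (Await k)      = feed [] (k Nothing)
feed ins      (Lift act)     = act >>= feed ins
feed _        (Done _)       = return []

-- Input type (): an upstream is free to hand over Just ().
unitSource :: Toy () String m ()
unitSource = Await $ \mi -> case mi of
  Nothing -> Yield "got Nothing" (Done ())
  Just () -> Yield "got Just ()" (Done ())

-- Input type Void: the Just branch can only be discharged with absurd.
voidSource :: Toy Void String m ()
voidSource = Await $ \mi -> case mi of
  Nothing -> Yield "got Nothing" (Done ())
  Just v  -> absurd v

main :: IO ()
main = do
  print (runIdentity (feed [()] unitSource))
  print (runIdentity (feed [] voidSource))
```

Ignoring bottom, the only list of `Void` inputs that can be written is the empty one, so `voidSource` can only ever observe `Nothing`, whereas `unitSource` observes whatever its upstream chooses to send.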

A Sink consumes values and provides a return value, but produces none as an output stream

type Sink     i m   r =           ConduitM i  Void m r

This makes sense. If I wanted to produce an output, I would need something of type Void, which I can't get.

A Consumer is something that consumes values and provides a return value, but may or may not produce

type Consumer i m   r = forall o. ConduitM i  o    m r

I don't understand how it's possible to produce in this case. I would need something of type forall o. o and if I had that, I'd be able to get Void.
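One possible reading of "may or may not produce" is that a Consumer never yields itself but can be placed wherever some output type is expected. The sketch below illustrates that reading in a simplified model; `Toy`, `sumAll`, `asSink`, `asMiddle` and `feed` are hypothetical names, and the synonyms are toy analogues of the README definitions rather than the library's own.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Identity (runIdentity)
import Data.Void (Void)

-- Toy stand-in for ConduitM (an assumption for illustration, not conduit's real type).
data Toy i o m r
  = Yield o (Toy i o m r)          -- emit one output, then continue
  | Await (Maybe i -> Toy i o m r) -- request input; Nothing means none is available
  | Lift (m (Toy i o m r))         -- run an effect in the base monad
  | Done r                         -- finish with a result

-- Toy analogues of the README synonyms.
type Sink     i m r = Toy i Void m r
type Consumer i m r = forall o. Toy i o m r

-- Sums its inputs. It cannot use Yield: it has no value of the unknown type o.
sumAll :: Consumer Int m Int
sumAll = go 0
  where
    go acc = Await (maybe (Done acc) (\x -> go (acc + x)))

-- The same value used at two output types, with no conversion.
asSink :: Sink Int m Int
asSink = sumAll            -- o instantiated to Void

asMiddle :: Toy Int String m Int
asMiddle = sumAll          -- o instantiated to String

-- Feed a list of inputs (then Nothing forever) and return the result.
feed :: Monad m => [i] -> Toy i o m r -> m r
feed ins      (Yield _ next) = feed ins next
feed (i : is) (Await k)      = feed is (k (Just i))
feed []       (Await k)      = feed [] (k Nothing)
feed ins      (Lift act)     = act >>= feed ins
feed _        (Done r)       = return r

main :: IO ()
main = do
  print (runIdentity (feed [1, 2, 3] asSink))
  print (runIdentity (feed [1, 2, 3] asMiddle))
```

In this model the observation above holds: a value of the Consumer type cannot itself produce output. What the `forall o` buys is flexibility at the use site, at the price of needing the `RankNTypes` extension for the synonym.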

I've created a PR for just the deprecation aspect of things here: #307.