Proposal: Rename `Foldable1` and `Bifoldable1` and then move them into base
Closed this issue ยท 92 comments
I propose that we move Foldable1
and Bifoldable1
from semigroupoids to base, as has been discussed many times over the years, and has most recently been championed by @phadej. His write-up found here contains a large collection of useful links to context and prior discussion. I invite you to read that; I couldn't possibly summarise all the discussion here.
Before we move these classes to base, I propose that we get their names right, along the lines discussed by @chessai and others on this semigroupoids issue (Foldable1 -> Semifoldable
, Bifoldable1 -> Semibifoldable
, with matching renamings for class members).
Hence this proposal involves multiple steps:
- Release version 6.0.0 of
semigroupoids
with a mass-renaming (Foldable1
toSemifoldable
,Bifoldable1
toSemibifoldable
, etc.) - Update downstream packages of
semigroupoids
to use the new major version and hence the naming scheme. - Let the ecosystem settle for a while.
- Move
Semifoldable
andSemibifoldable
tobase
.
We could alternatively do the renaming and relocation as one step, but this approach leads to a rather unfortunate migration path for users. I would prefer they be two simple and discrete steps, giving the considerable downstream package ecosystem time to breathe in between.
This proposal an additive (non-breaking) change to base, and so that step is relatively easy. Moving other classes from semigroupoids
to base
is explicitly out of scope for this proposal. Most or all other classes (such as Semitraversable
) require breaking changes elsewhere in the hierarchy (such as wedging Semiapplicative
into the Monad
hierarchy).
I note that this is as much a proposal about semigroupoids
as it is about base
; hence we would need the support of the semigroupoids
maintainers(cc. @ekmett).
I still think that Foldable1
(or whichever name it's renamed to) should be moved out of semigroupoids
into own package (Bifoldable1
can be there on in bifunctors
), as semigroupoids
is just too dependency heavy as used as compatibility package. People don't want to depend on it. If semigroupoids
is used as a compat package, the adoption may be abysmal. (I know an example where a maintainer didn't want to depend on semigroups
, and Semigroup
instances were added only when the support for older GHCs were dropped to avoid non-uniform APIs).
I.e. First move & rename functionality into own package (like https://github.com/phadej/foldable1), then make semigroupoids
depend on it (like in ekmett/semigroupoids#87).
I don't see a need to wait for ecosystem to settle Foldable1
is defined in handful of packages: instance.*Foldable1
search shows 50 packages https://hackage-search.serokell.io/?q=instance.*Foldable1 (also good part is Edwards packages) I.e. if change is done today to semigroupoids
, the next base
(released in 6 months?) could well have Foldable1
in it.
I like @phadej's suggestion to factor out Foldable1
into a package on its own. Otherwise future packages, using Foldable1
from base
, will be forced either support only GHC 9.4+, or depend on relatively heavyweight semigroupoids
.
CC @ekmett @RyanGlScott on behalf of semigroupoids
.
I think it could help to split this task into three:
- Does CLC approve inclusion of
Foldable1
intobase
in principle? - If yes, is CLC in favor of renaming to
Semifoldable
? - If yes, which migration plan is recommended by CLC?
Dear CLC members, could you please answer questions 1 and 2, before we venture on bikeshedding 3?
As you'd expect, I'm yes on 1 and 2.
I'm slightly more inclined to keep current naming to simplify migration. While Semifoldable sounds great, semimaximum and semilast are horrible.
My proposal had different naming schemes. NonEmptyFoldable
naming scheme is not horrible. And doesn't try to make Semi
prefix to mean non empty. EDIT: semi
means "not completely: semigroup
is almost group, no inverse so no identity; semimaximum
doesn't make sense; semifoldr
more, as that's a foldr
without initial element; semifoldrMap
is questionable, foldrMap1
means "map 1, foldr rest" :)
Another
head
, maximum
and foldr1
already exist in base
.
These are good names, but they are partial. So there is argument
- for some methods and functions to be named even without an affix, i.e. have good names for the right things.
- or against, as that would cause name clashes (yet, people using
Foldable1
probably won't usemaximum
fromFoldable
orhead
fromData.List
; but these arePrelude
names, so inconvenience is present).
Hmm. Using unprefixed head
, last
, maximum
and minimum
for Semifoldable
could be a bold, but long-awaited move to pave an (extremely long-term) migration from partial Prelude
.
I'm yes on 1, no on 2. I think the *1
suffix is fine on the class name, but that we should clash with the Prelude on the function names. I agree with @Bodigrim that it will steer us in the right direction when the time comes to de-partialize the Prelude.
FWIW (not a CLC member) :
- moving Foldable1 to base is a great idea , +1
- renaming it to SemiFoldable (or NonEmptyFoldable, even better as @phadej suggests ) : +1 , because currently there are classes like Show1 and Eq1 in base for
* -> *
types (and Generic1 will land soon afaik). I'm concerned this overloaded meaning of1
would cause confusion.
Hmm. Using unprefixed
head
,last
,maximum
andminimum
forSemifoldable
could be a bold, but long-awaited move to pave an (extremely long-term) migration from partialPrelude
.
@Bodigrim this is a great suggestion and I'm sure would be well received.
Hmm. Using unprefixed
head
,last
,maximum
andminimum
forSemifoldable
could be a bold, but long-awaited move to pave an (extremely long-term) migration from partialPrelude
.
(Not a CLC member): I think this argument (re naming schemes) should be considered as particularly important.
I'd also consider Semi-
as way to jargon-y without actually having good precedent for it. Current Foldable1
seems fine; NonEmptyFoldable
seems ok, but nonEmpty-
for the class functions seems unergonomically verbose.
So, I feel like it would be Foldable1
or NonEmptyFoldable
for me, and unprefixed functions, at the current state of the discussion.
These are the two cents of a bystander.
(Not a CLC member)
I'd love to see typeclasses Foldable1
in base
. I'm -1 on renaming them. I don't think that renaming solves any problem. It'll introduce even more work for volunteers to migrate their packages and I don't get the reasons for asking people to do more work because new names somehow make more sense. They don't to me and I think that Foldable1
is already perfectly fine.
Regarding Eq1
and Show1
typeclasses: we now have QuantifiedConstraints
in GHC and those classes can be safely deprecated. So there's no need to continue this naming convention with the meaning "typeclasses for higher-kinded data types".
Let's not forget that there are multiple "user personas" involved here:
- people who have never used Foldable1 or
semigroupoids
: Perhaps they would start using folds over non-empty data once it's in base but they don't face breaking change. - authors who are currently relying on
semigroupoids
: they would be impacted by a renaming of Foldable1. - all the work around Generic: like it or not, Show1, Eq1 and Generic1 have been in base for a while and many libraries rely upon them. While we can argue that the
1
is not a suggestive naming choice, it's become canon by now.
Questions :
- is the deprecation of Show1, Eq1 "a thing"? What's the timeline and/or process for deprecating them ? I cannot wrap my head around this use of QuantifiedConstraints.
Generic1 and Generically have recently landed in base (https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5726/diffs), should they be renamed too?
is the deprecation of Show1, Eq1 "a thing"?
It's not that simple. In particular Eq1 f
is strictly more powerful than forall a. Eq a => Eq (f a)
. We can do
applesAndOranges :: :: [Apple] -> [Orange] -> Bool
applesAndOranges = liftEq appleAndOrange where
appleAndOrange :: Apple -> Orange -> Bool
appleAndOrange = ...
i.e. compare containers of different values.
Similarly liftShowsPrec
would allow to show Set Apple
using canonical instance Set
showing function, but let the user supply how to show an Apple
. (I use Set
as an example because you cannot coerce
whole Set Apple
to Set MyApple
, and have a Show MyApple
instance you'd like).
So Eq1/Show1/...
add expressive power & vocabulary.
What will probably happen is that QuantifiedConstraints
will become a constraints of Eq1
, Show1
etc.:
class (forall a. Eq a => Eq (f a)) => Eq1 f where
liftEq :: (a -> b -> Bool) -> (f a -> f b -> Bool)
The effect is that instead of eq1 :: (Eq1 f, Eq a) => f a -> f a -> Bool
you could directly use ==
, as Eq1 f, Eq a
constraints will imply Eq (f a)
.
There is a Merge Request !4727, but it's currently a draft. I guess the change there will eventually be brought up to CLC.
Generic1
has been in base
essentially forever now. Since GHC-7.4 / base-4.6
. Renaming it would break quite a lot of code.
I don't see anything wrong with Generically
name.
I'm happy to go with whatever the CLC wants to do here, up to and including handing the package over to the CLC.
Naming issue aside, after looking into this a bit, I'm still pretty unclear why these classes can't just keep living in semigroupoids. I guess I've never yet had to use them in anger, so they still seem pretty rarely-occurring to me... but maybe I'm just not aware of what kinds of code people are writing these days outside of my own bubbles.
I wonder what the ratio of packages transitively depending on semigroupoids due to the need to write Foldable1/Bifoldable1 instances vs. all packages transitively depending on semigroupoids is. That is to say, how much would the transitive dependency set of semigroupoids be reduced if these classes got moved into base?
At least in my case, while I've never really touched these classes, semigroupoids is almost always in my transitive dependency set on any medium to large project already, so depending on it once more never seems very painful.
Also, if it's really the case that separating these classes out would help, perhaps another answer that wouldn't require the blessing of the CLC at all would just be to split the package (as insufferable as it is to suggest a tiny package like that).
Also, if it's really the case that separating these classes out would help, perhaps another answer that wouldn't require the blessing of the CLC at all would just be to split the package (as insufferable as it is to suggest a tiny package like that).
One argument is that without Foldable1
in base
there is no way Applicative
would get Apply
as a superclass (which naturally would pull Traversable1
into base
as well). Whether that (very very) long-term plan is viable is a separate question, but by not having Foldable1
in base
the answer is definitive: Apply
won't ever be in base
.
Otherwise I agree, tiny packages are fine: some
, these
, data-fix
, witherable
. I don't see a real need to roll these into the base
in any form.
There is a dilemma though, what if someone would need a kind-generics
instance for these
. Would kind-generics
maintainer depend on these
for that instance (i'm somewhat sure they wont). I know that these
won't depend on kind-generics
(as there are too many generics libraries to depend on). Luckily nobody yet needed that (or were happy to define an orphan in their application).
The last point is kind of a motivation for a (very quick, without process) migration of Data.ByteArray
from primitive
to base
. The motivation is that bytestring
and text
would like to use that type, and would need to depend on primitive
, so the option was to either make primitive
boot library or move the type to base
. I think the wrong choice was made, I'd preferred whole primitive
made into a boot library. (The compatibility story is still unclear to me in that move, will text
depend on primitive
for older base
?).
EDIT: or was it because ghc
itself needs ByteArray
? Still not a reason, ghc
library depends on a lot of libraries, primitive
could been one of them. (https://hackage.haskell.org/package/base-4.16.0.0/docs/GHC-Arr.html is in base
, but the proper package to use it is array
. That's another potential design.)
But if CLC wants to be consistent, then Foldable1
should be moved base
too, at least when haskell/containers#616 (nonempty Map
and Set
and ...) is closer to the finish line.
-1 to renaming.
Semi prefix gives a strong impression that it's weaker than the original, while it's the other way around.
I'm concerned about the confusion this could cause than confusion with Eq1, Show1, etc (they are kind of niche; I rarely use them). Even for experts who know what they are, it's a lot of work to rename the use sites.
Naming issue aside, after looking into this a bit, I'm still pretty unclear why these classes can't just keep living in semigroupoids. I guess I've never yet had to use them in anger, so they still seem pretty rarely-occurring to me... but maybe I'm just not aware of what kinds of code people are writing these days outside of my own bubbles.
It's true that it's not a huge inconvenience for intermediate to advanced users who are familiar with setting up projects and pulling in dependencies. However it would be nice to have this in base for people just learning the language. Part of teaching Haskell is teaching Haskell values such as "make invalid states unrepresentable". Nonempty structures are a clear example of this principle, and having better support for accessing them ergonomically in base would make that easier to teach.
Additionally, there are Foldable methods (minimum(By), maximum(By), foldl1, foldr1, etc.) that should have Foldable1 constraints, not Foldable constraints. Moving Foldable1 into base will help realise the long-term goal of moving these functions to the proper typeclass.
(not CLC) Just to summarise, there seems to be a weak consensus for importing Foldable1
and friends into base
(with the associated deprecation cycle for semigroupoids
) and against renaming them. AFAICT renaming is the most contentious issue, which would cause maintenance busywork on those that currently import those typeclasses (users of semigroupoids
).
Type class name
I don't see an issue with keeping Foldable1
. Indeed there are Eq1
, Ord1
, Show1
, which generalize base classes in a different direction, but is there a real ambiguity? Eq
constraints *
, while Eq1
constraints * -> *
. But Foldable
is already for * -> *
, what other non-contrived generalization is available?
Keeping an old name has a usual benefit of causing less breaking changes for clients and less work on updating imports.
Rationale for inclusion in base
From my perspective the biggest motivation to include Foldable1
into base
is that Foldable
is flawed and exposes users to unsafe functions. Right now every time you encounter Foldable.foldl1
or Foldable.maximum
, you do not know whether it is safe, because for some types it is and for others not at all.
Ideally we should put huge red warning signs "Unsafe partial function, provided for historical reasons, please consider using ... instead", but right now we cannot even suggest any safe replacement in base
or nearby. A learner just exploring Haskell in ghci
should not be pointed to installing semigroupoids
and rolling out their own implementations of foldl1
and maximum
.
The reduced dependency footprint is a lesser concern, but for the record the majority of libraries I work on do not depend on semigroupoids
.
I know that several alternative preludes ship Foldable1
, e. g., relude
and rebase
. @chshersh @nikita-volkov could you possibly comment on your motivation to do so?
Method names
I'm on board with a "fat" set of functions as outlined in @phadej's proposal. I believe that functions, providing total alternatives to partial functions in Prelude
and Data.Foldable
, should share the same name to simplify migration from unsafe to safe. This gives us fold{l,r}1{,'}
and m{ax,in}imum
and head
/ last
. Further, fold1
/ foldMap1
/ foldMap1'
should retain their existing names from Data.Semigroup.Foldable
.
Bifoldable1
I'm not convinced about moving Bifoldable1
to base
. At least my motivation described above does not fit for this case, and it seems to be much more niche, isn't it?
Migration
While adding Data.Foldable1
to base
is not a breaking change, to leverage added value and facilitate smoothless adoption, it is important to provide a low-dependency compatibility package foldable1
, providing the same interface for users of older GHCs. It seems that all ground work is already there (thanks @phadej) and we have @ekmett's blessing to go forward, so we just need to settle on naming.
@emilypi @chessai @cgibbard could you please opine on naming?
@Bodigrim if Foldable1
is moved to base
, then Bifoldable1
should be moved at least to bifunctors
package to avoid orphans: e.g. Join
has Bifoldable1 p => Foldable1 (Join p)
instance (currently in semigroupoids
). I'm sure Edward is fine with that too.
EDIT: And my plan was to use bifunctors
as compat package for Bifoldable1
anyway, so the practical difference is small.
CC @treeowl @sjakobi @mstksg on behalf of nonempty containers.
CC @chris-martin @argumatronic to get an educational perspective.
@Bodigrim One of the relude
goals is to be lightweight in terms of dependencies and depend on boot libraries exclusively (with unordered-containers
being the only exception to this rule). semigroupoids
is not a boot library so depending on it is not an option, considering project goals.
However, we still want to have the Foldable1
abstraction in relude
. The motivation for having this typeclass is outlined in the corresponding issue:
In short, having Foldable1
in relude
:
- Enables to write total versions of functions like
foldr1
,maximum1
,minimum1
, etc. - Allows to battle test in real projects the fact that something like
Foldable1
is in a standard library
@Bodigrim Adding Foldable1
to base
seems useful to me. I don't really have an opinion on Bifoldable1
โ I don't think I've used or wanted to use it before. I'm not a fan of the name Semifoldable
so far โ I don't think it suggests that it is about non-empty structures in any way.
I can't say much about the work on non-empty Map
s and Set
s in containers
. It seems like a nice feature addition to me in principle, but I'm unclear on the details, especially on the API changes involved. Note that there is a package nonempty-containers
, although I don't know whether or how it is related to the work in containers
.
If you'd like me to comment on anything else, please let me know! :)
If it comes down to a choice is between:
Foldable1
et al are moved into a minimalfoldable1
package; orFoldable1
at al are moved intobase
, and a minimalfoldable1
compatibility package is also provided
then I think the decision rests largely on what you want to communicate to the general Haskell-aware public.
Some connotations I expect casual observers are likely to perceive as a result of including Foldable1 in base:
- Foldable1 is something that a person who learns Haskell ought to be aware of. It is something that is recommended. It is not something exotic.
- If you're going to publish a library that defines and exports a datatype, you can be reasonably expected to provide a Foldable1 instance for your datatype, if applicable.
(Taking the further step of including Foldable1 in Prelude would convey the same things, of course, more emphatically.)
Moving it into base is a cultural change that makes it a part of what Haskell is. So, is that what you want to do?
(musings of a non-CLC member)
Foldable1 is something that a person who learns Haskell ought to be aware of. It is something that is recommended. It is not something exotic.
If you're going to publish a library that defines and exports a datatype, you can be reasonably expected to provide a Foldable1 instance for your datatype, if applicable.
I think these are reasonable goals, in line with enabling and promoting "more correct"/"more safe" Haskell coding practices ergonomically. I personally believe we should promote that, and as you wrote, including them in base
makes exactly this statement.
While additions to base
should be very thoroughly and well considered, lest we end up stuck with suboptimal representations for decades, cf. partial head
, a mere addition is not a breaking change, so after ample consideration this alone should not be contentious. Utilizing these new classes as constrains for existing functions is breaking, but obviously must be the path forward (potentially with a sufficient migration period and plan with warnings, etc.) if the committee decides to include the classes in base
.
CLC discussed the design of Foldable1
today. Current (non-binding) opinion is to retain Foldable1
name and have unsuffixed methods for m{ax,in}imum
and head
/ last
. CLC also tentatively supports adding Bifoldable1
to base
.
CLC did not come to a conclusion about following class members:
-- fold variants with premap.
-- Without this map, we cannot implement foldl using foldr etc.
foldrMap1 :: (a -> b) -> (a -> b -> b) -> t a -> b
foldlMap1' :: (a -> b) -> (b -> a -> b) -> t a -> b
foldlMap1 :: (a -> b) -> (b -> a -> b) -> t a -> b
foldrMap1' :: (a -> b) -> (a -> b -> b) -> t a -> b
My understanding is that the motivation for their inclusion is ekmett/semigroupoids#77 (comment), but could interested parties perhaps elaborate more? CC @andrewthad @phadej @recursion-ninja
I have had to roll my own foldlMap'
twice. Both times it involved consuming a linear, non-empty structure of "commands" into "effects."
How bad is it to just let those extra methods be functions? Do people actually want to define them? I know I would prefer to avoid doing so if possible.
I'm in favor of including Foldable1
and see it as part of the larger goal of "departializing" the Prelude
etc. I'd be happier with the change knowing that this wasn't just an ad hoc measure and that there is a plan or goal to clean up the Prelude
, normalize using non-empties, and generally working toward more correct/safer Haskell.
This goal, to be sure, will render most of the existing educational resources for Haskell "broken" to one degree or another, but since most of them were produced before NonEmpty
was even in base
and many (most?) don't even mention it or any functions over it. So, perhaps we should already be considering them broken in that regard. This could seriously change the things we show first (in absolute beginner tutorials, for example) for the better. If this were Twitter, I'd have to insert some clapping hands and say NORMALIZE NONEMPTY.
For my purposes, I'd prefer a naming scheme involving a mention of NonEmpty
but understand this is cumbersome to type, and we all know programmers hate typing, and I don't find generally find names themselves are a serious impediment to learning.
I don't have much occasion myself to use Bifoldable1
so have no comment on that.
I should add that advance warning of big-picture changes is also very helpful for tutorial maintainers. If the main reason the committee approves this proposal is because its members each privately hope that it will lead to larger changes (Departialization Of Prelude), I would personally find it preferable to see an explicit plan for the larger project, rather than watch the process unfold piece-by-piece without knowing the overall trajectory. Sometimes things get pulled into base and wholly embraced; sometimes things get pulled into base with greater ambitions that never end up getting fulfilled. Knowing what's going to happen informs what I'm going to choose to recommend that will best serve readers in the years to come. It would be much more comfortable to start emphasizing NonEmpty in intro courses because the committee has committed to supporting it, rather than to make a guess based on reading behind the lines in proposal discussions.
There is no explicit plan for coveted Departialization Of Prelude, and I'm not sure whether it is even possible in our lifetime, much less about how.
A more reachable goal is a partial departialization of Data.Foldable
. The very first step is at least to provide safe alternatives for those who are tired of Exception: Prelude.maximum: empty list
. This is what this proposal is basically about.
If Data.Foldable1
API gets traction and we see more mainstream usage of nonempty sets/maps/vectors, then maybe in 3-4 years we can start talking about gradual supplanting of the old partial API.
How bad is it to just let those extra methods be functions? Do people actually want to define them? I know I would prefer to avoid doing so if possible.
I think it would be fine to define them as functions within the module that defines and exports Foldable1
.
How bad is it to just let those extra methods be functions? Do people actually want to define them? I know I would prefer to avoid doing so if possible.
Note the default definitions:
foldMap1 :: Semigroup m => (a -> m) -> t a -> m
foldMap1 f = foldrMap1 f (\a m -> f a <> m)
foldrMap1 :: (a -> b) -> (a -> b -> b) -> t a -> b
foldrMap1 f g xs =
appFromMaybe (foldMap1 (FromMaybe #. h) xs) Nothing
where
h a Nothing = f a
h a (Just b) = g a b
Also IIRC my benchmarks (in https://github.com/phadej/foldable1) showed that when you can use foldrMap1 or foldlMap1
it's often more efficient. (Then foldr1Map defined in terms of foldMap1).
I'm curious why they're slower. I'm not opposed to the new methods, but can you explain what makes them helpful?
but can you explain what makes them helpful?
I honestly don't have any energy to discuss this issue. I'm very sorry.
EDIT: my implementation has maximum
implemented using foldMap1
, but it probably should use foldMap1'
, as that is changed in base
for its maximum
in the meantime.
EDIT: foldr1Map
is more general then foldr1
. General enough that we can define foldMap1
using foldrMap1
, I couldn't figure out whether we can define foldMap1
using foldr1
(I think it's impossible). So, if for performance sake users define foldr
, then they probably should define foldr1Map
too. If there are no such examples, then foldr1
should not be in the class either.
i found some energy:
Consider an alternative definition of NonEmpty
:
data NonEmpty2 a
= L a
| a :$ NonEmpty2 a
It's natural eliminator is foldr1Map
:
foldr1Map :: (a -> b -> b) -> (a -> b) -> NonEmpty2 a -> b
foldr1Map _ g (L a) = g a
foldr1Map f g (a :$ as) = f a (foldr1Map f g as)
Then we can look at usual NonEmpty
is "new" light,
it's one element in front of many cons:
data NonEmpty a = a :| [a]
Thus it's quite natural eliminator is foldl1Map
:
foldl1Map :: (b -> a -> b) -> (a -> b) -> NonEmpty a -> b
foldl1Map f g (a :| as) = foldl f (g a) as
How bad is it to just let those extra methods be functions? Do people actually want to define them? I know I would prefer to avoid doing so if possible.
You don't need to, they have default definitions using foldMap1
.
but they do very much make sense for NonEmpty
variants.
(i.e. linear containers @recursion-ninja mentioned).
They (left-folds) also make sense for Tree
.
@phadej the usual NonEmpty
comes out looking a tad awkward in that context, I think. data NR a = NR a (Reverse [] a)
would be a better fit for foldl1Map
, wouldn't it?
@phadej , yes, the strict version makes sense for that, and the lazy one seems to be the natural fold for the Reverse
version. Messy, but I'm starting to see why they're desirable.
The problem here seems to be that there's no real operation semantics required to be shared between instances of foldl
, foldl'
, foldr
, foldr'
, ... . The semantics depend on the precise implementation of the data type. foldl'
is the appropriate accumulating function for cons lists, whereas foldr'
is the appropriate one for snoc lists. foldr
is the appropriate way of lazily streaming a cons list, foldl
is the appropriate way for a snoc list. For tree types, where are intermediate between the two, the choice is more confusing. This makes it impossible to use Foldable
as a genuine abstraction with comprehensible performance implications. All that it can be used for is name overloading. This problem is now leaking into the discussion about Foldable1
.
The problem here seems to be that there's no real operation semantics required to be shared between instances of
foldl
,foldl'
,foldr
,foldr'
, ... .
I'm confused. Data.Foldable
gives precise laws, linking semantics of foldl
and foldr
.
Denotational semantics, yes. But denotationally there is no difference between foldl
, foldl'
and foldr . flip
. Operationally they are critically different! That's why I say there is no meaningful way to program Foldable
-polymorphically.
To be a little less abstract, suppose I define
sum :: Foldable t => t [Int] -> Int
sum = foldl' (+) 0
Then sum
works well for []
but poorly for SnocList
where:
data SnocList a = Nil | Snoc (SnocList a) a deriving Foldable
Therefore I find Foldable
to be a poor tool for generic programming.
That's why sum
is defined as
sum :: (Foldable t, Num a) => t a -> a
sum = getSum #. foldMap' Sum
in base
.
Having qualified or prefixed List.foldr
, Set.foldr
, etc is inconvenient. Type-classes can be used for adhoc polymorphism, there is nothing wrong with that, and Foldable
is good home for foldr
etc.
I'm not suggesting there's anything wrong with it (although I would call it overloading rather than ad hoc polymorphism) just that it seems to be at the root of the performance issues that you and David were discussing earlier.
And for the record, foldMap'
doesn't help. In the following code the sum
of a one million element list runs in constant space. sum
of a one million element SnocList
takes ten times as long and 45MB of heap! I suppose the correct solution is to roll ones own foldMap'
. I haven't thought of the consequences of that. But the point remains that foldl/r/'
are not sufficiently generic.
{-# LANGUAGE DeriveFoldable #-}
data SnocList a = Nil | Snoc (SnocList a) a deriving Foldable
fromTo :: Int -> Int -> SnocList Int
fromTo i n | i >= n = Nil
| otherwise = Snoc (fromTo (i+1) n) n
main :: IO ()
main = do
print (sum (fromTo 1 (1 * 1000 * 1000)))
print (sum [1..1 * 1000 * 1000])
(ghc-9.2 required, to get the sum in terms of foldMap'
)
Note: I'm only talking about foldrMap1
, foldlMap1
and foldMap1
here, but ofc the same applies to foldrMap1'
, foldlMap1'
and foldMap1'
.
At first, I wanted to ask why foldrMap1
/foldlMap1
are needed when we have foldMap1
, until I realized that they actually do something different. I'm not sure naming them foldrMap1
/foldlMap1
is a good idea, since it suggests (at least to me) a close relationship to foldMap1
(the real problem is rather that foldMap
took the name for meaning "combine the results with <>
", but it's too late to change that now). However, I can't think of anything better either, as the current names are pretty descriptive.
Responding to the request in #9 (comment) for the motivation behind foldrMap1
:
foldrMap1 :: (a -> b) -> (a -> b -> b) -> t a -> b
Let's say that I have a non-empty container and I want to encode everything in it as text and put commas between the elements. Here is what the builder type and functions might look like:
-- keeping this abstract for clarity
model Bldr where
data Builder
comma :: Builder
person :: Person -> Builder
Next, we have a non-empty set of Person
and we want to encode all people with commas between them:
encodeManyPeople :: NonEmptySet Person -> Builder
encodeManyPeople = foldrMap1 Bldr.person (\p acc -> Bldr.person p <> Bldr.comma <> acc)
This was what originally made me want these functions in the class.
Playing the devil's advocate:
@andrewthad, in that case you could (and probably should if you write Foldable1
polymorphic code) defined an auxiliary newtype
:
newtype CommaBuilder = CommaBuilder { unCommaBuilder :: Builder }
instance Semigroup CommaBuilder where
CommaBuilder x <> CommaBuilder y = CommaBuilder (x <> Bldr.comma <> y)
encodeManyPeople :: Foldable1 f => f Person -> Builder
encodeManyPeople xs = unCommaBuilder (foldMap1 (CommaBuilder . Bldr.person) xs)
-- or using some newtype library:
encodeManyPeople = alaf CommaBuilder foldMap1 Bldr.person
@phadej That's a neat trick. I had not considered that, and I think the performance should be the same. Haven't tested it though.
I think enough time has passed that discussion has settled. It seems as though we have a vague consensus on:
- Bringing
Foldable1
and Bifoldable1 into base - Not renaming them
- Using a new
foldable1
package as a compatibility package forFoldable1
- Using the existing
bifunctors
package as a compatibility package forBifoldable1
- Including the
foldrmap1
etc. methods in theFoldable1
class
Can I please get a CLC vote on this version of the proposal?
I'm +1.
I'm getting convinced that foldrMap1
and friends make sense. But if we put them as type class members, do we need to retain foldr1
as a class member as well? Is it ever possible to have a better implementation than foldr1 = foldrMap1 id
?
It would be nice if someone put together an MR to base
, so that we clearly see what we vote for.
@strake is it still correct, that you are prepared to donate foldable1
package name to a compatibility package, if this proposal is approved?
Is it ever possible to have a better implementation than foldr1 = foldrMap1 id?
Good point. I haven't found one.
It would be nice if someone put together an MR to base, so that we clearly see what we vote for.
I think you are still unclear about the naming. I understand that the naming scheme you want is
- not renaming
- having un-affixed names for
head
, ... i.e. - You are unsure whether
foldr1
,... should be members.
module Data.Foldable1 where
class Foldable t => Foldable1 t where
fold1 :: Semigroup m => t m -> m
foldMap1 :: Semigroup m => (a -> m) -> t a -> m
foldMap1' :: Semigroup m => (a -> m) -> t a -> m
-- These four may or may not be a memberse
foldr1 :: (a -> a -> a) -> t a -> a
foldr1' :: (a -> a -> a) -> t a -> a
foldl1 :: (a -> a -> a) -> t a -> a
foldl1' :: (a -> a -> a) -> t a -> a
toNonEmpty :: t a -> NonEmpty a
maximum :: Ord a => t a -> a
minimum :: Ord a => t a -> a
head :: t a -> a
last :: t a -> a
foldrMap1 :: (a -> b) -> (a -> b -> b) -> t a -> b
foldlMap1' :: (a -> b) -> (b -> a -> b) -> t a -> b
foldlMap1 :: (a -> b) -> (b -> a -> b) -> t a -> b
foldrMap1' :: (a -> b) -> (a -> b -> b) -> t a -> b
intercalate1 :: (Foldable1 t, Semigroup m) => m -> t m -> m
foldrM1 :: (Foldable1 t, Monad m) => (a -> a -> m a) -> t a -> m a
foldlM1 :: (Foldable1 t, Monad m) => (a -> a -> m a) -> t a -> m a
foldrMapM1 :: (Foldable1 t, Monad m) => (a -> m b) -> (a -> b -> m b) -> t a -> m b
foldlMapM1 :: (Foldable1 t, Monad m) => (a -> m b) -> (b -> a -> m b) -> t a -> m b
maximumBy :: Foldable1 t => (a -> a -> Ordering) -> t a -> a
minimumBy :: Foldable1 t => (a -> a -> Ordering) -> t a -> a
-- this comes for free from intercalate1 implementation
intercalateMap1 :: (Foldable1 t, Semigroup m) => m -> (a -> m) -> t a -> m
module Data.Bifoldable1 where
class Bifoldable p => Bifoldable1 p where
bifoldMap1 :: Semigroup s => (a -> s) -> (b -> s) -> p a b -> s
I'd suggest to nail this down before implementing.
(This is essentially Appendix: Foldable1 synopsis with clashing names in https://oleg.fi/foldable1-proposal3.html, correcting Bifunctor -> Bifoldable typo)
In line with recent developments, if there is no better implementation of foldX1Y
than foldXMap1Y id
, I think foldX1Y
should better live outside of the class. This will also make Foldable1
a bit less intimidating.
I.e.
module Data.Foldable1 where
class Foldable t => Foldable1 t where
fold1 :: Semigroup m => t m -> m
foldMap1 :: Semigroup m => (a -> m) -> t a -> m
foldMap1' :: Semigroup m => (a -> m) -> t a -> m
toNonEmpty :: t a -> NonEmpty a
maximum :: Ord a => t a -> a
minimum :: Ord a => t a -> a
head :: t a -> a
last :: t a -> a
foldrMap1 :: (a -> b) -> (a -> b -> b) -> t a -> b
foldlMap1' :: (a -> b) -> (b -> a -> b) -> t a -> b
foldlMap1 :: (a -> b) -> (b -> a -> b) -> t a -> b
foldrMap1' :: (a -> b) -> (a -> b -> b) -> t a -> b
-- These are trivially foldr1 = foldrMap1 id etc.
foldr1 :: Foldable1 t => (a -> a -> a) -> t a -> a
foldr1' :: Foldable1 t => (a -> a -> a) -> t a -> a
foldl1 :: Foldable1 t => (a -> a -> a) -> t a -> a
foldl1' :: Foldable1 t => (a -> a -> a) -> t a -> a
intercalate1 :: (Foldable1 t, Semigroup m) => m -> t m -> m
foldrM1 :: (Foldable1 t, Monad m) => (a -> a -> m a) -> t a -> m a
foldlM1 :: (Foldable1 t, Monad m) => (a -> a -> m a) -> t a -> m a
foldrMapM1 :: (Foldable1 t, Monad m) => (a -> m b) -> (a -> b -> m b) -> t a -> m b
foldlMapM1 :: (Foldable1 t, Monad m) => (a -> m b) -> (b -> a -> m b) -> t a -> m b
maximumBy :: Foldable1 t => (a -> a -> Ordering) -> t a -> a
minimumBy :: Foldable1 t => (a -> a -> Ordering) -> t a -> a
-- this comes for free from intercalate1 implementation
intercalateMap1 :: (Foldable1 t, Semigroup m) => m -> (a -> m) -> t a -> m
module Data.Bifoldable1 where
class Bifoldable p => Bifoldable1 p where
bifoldMap1 :: Semigroup s => (a -> s) -> (b -> s) -> p a b -> s
?
Yes, at least from my point of view.
I really don't see the point of adding Bifoldable1
to base
. I never even did or want to use Bifoldable
and I think Bifoldable1
is even more niche. The only instances for Bifoldable
seem to be for Either
, tuples and wrapper types, nothing particularly meaningful IMO.
I think that base
already has too many classes that are rather niche and don't belong in a standard library (Category
, Arrow
, MonadFix
, MonadZip
, Bifunctor
, Bitraversable
). So unless you can provide me a convincing use case for Bifoldable1
(other than "Foldable
has Foldable1
, so Bifoldable
needs Bifoldable1
"), I really don't see the point of adding it to base
.
I personally haven't gotten anything from Bifoldable
or Bitraversable
. Biapplicative
can do some neat tricks tricks, but the only reason to add it to base
would be to make traverseBia
a method of Traversable
.
IMO CLC could just vote separately whether Bifoldable1
should be moved to base or not. It can be moved to bifunctors
. That's fine too.
FWIW I have plenty of use cases for Bifoldable
/ Bitraversable
, and some of them could use Bifoldable1
as well. For example, here is a data type which can benefit from it:
data Statement var expr
= Set var expr
| Let var expr
| Trace expr
| Declare var
I'm not particularly bullish on this, but adding Data.Bifoldable1
does not cost us much and seems logical to me.
Dear CLC members, let's hold a signal (non-binding) vote before we ask someone to put up an MR. Are you happy with an interface proposed in #9 (comment)? Do you approve Bifoldable1
or would you rather have a separate proposal for it?
(if you have more than two options on the table that are not 100% independent, instead of multiple votes, I can strongly recommend using ranked voting and picking the Condorcet winner. This simplifies many things, and you don't have to be shy adding more variants to the ballot for example.)
I'm in favour of both that interface and Bifoldable1
I think we might as well add it all in one go. Saves us the trouble of adding Bifoldable1
later.
One more example for Bifoldable1
. Here is a type of expressions:
data Expr t
= Add (Expr t) (Expr t)
| Mul (Expr t) (Expr t)
| Literal t
It is Foldable1
. Now if we convert it to an unfixed representation we get
newtype Fix f = Fix (f (Fix f))
data ExprF t r
= Add r r
| Mul r r
| Literal t
type Expr t = Fix (ExprF t)
where ExprF
is Bifoldable1
.
Iโm in favor of both the interface as per #9 (comment) and adding Bifoldable1
in one go.
I will respond Monday.
@emilypi @cgibbard a kind reminder about the question in #9 (comment). I'm keen to make some progress before everyone is off for festive season.
I don't think I'm a roadblocking decision maker. That said...
- I think
Foldable1
looks great. - I'm not knowledgeable enough to comment on
Bifoldable1
. If I had made an effort to utilizeBifoldable
structures more, I might have some insight. Unless someone can identify a method to add toBifoldable1
, I think it's sufficient for now.
Iโm in favor of both the interface as per #9 (comment) and adding
Bifoldable1
in one go.
+1 from me
I don't have a very strong opinion either way on this, but it seems fine.
+1
+1
For anyone interested in making a MR, there's haddocks for a compatibility package of what I think is the most agreed implementation of Data.Foldable1
and Data.Bifoldable1
modules atm at https://oleg.fi/haddocks/foldable1/. The sources are at https://github.com/phadej/foldable1
Thanks, @phadej, I raised https://gitlab.haskell.org/ghc/ghc/-/merge_requests/8495
@phadej , Thank you for your tireless efforts throughout the Haskell ecosystem. This is great work.
Dear CLC members, let's vote on https://gitlab.haskell.org/ghc/ghc/-/merge_requests/8495 which adds Data.Foldable1
and Data.Bifoldable1
to base
. The design of the type classes matches #9 (comment), which we has approved above. The source code is extracted from https://github.com/phadej/foldable1, where you can find tests and benchmarks as well.
@tomjaguarpaw @chessai @cgibbard @emilypi @mixphix
+1 from me.
+1
+1
-1
Foldable
is a poorly-designed class (explanation elaborated upon in another thread) and I'm against propagating its design into other classes.
+1
Altogether 4 votes in favour and 1 against out of 6 possible, which is enough for the proposal to pass. Thanks all, approved.
I couldn't decide whether to vote +1 or -1 on it and realized that it wouldn't change the result anyway just as voting closed. :)
I'm still not sure this needs to be moved to base, but if people feel it does, I suppose it's fine.
fwiw @tomjaguarpaw i think you are correct in the long run, but that would be a separate proposal with less of a migratory burden than this one. This one accomplishes roughly a part of that migration that we'd need to do anyway.
I'm trying to summarise the state of this proposal as part of my volunteering effort to track the progress of all approved
CLC proposals.
Field | Value |
---|---|
Author | @gwils, @Bodigrim |
Status | merged |
base version |
4.18.0.0 |
Merge Request (MR) | https://gitlab.haskell.org/ghc/ghc/-/merge_requests/8495 |
Blocked by | nothing |
CHANGELOG entry | present |
Migration guide | not needed |
Please, let me know if you find any mistakes ๐