haskell/core-libraries-committee

Proposal: Rename `Foldable1` and `Bifoldable1` and then move them into base

Closed this issue ยท 92 comments

gwils commented

I propose that we move Foldable1 and Bifoldable1 from semigroupoids to base, as has been discussed many times over the years, and has most recently been championed by @phadej. His write-up found here contains a large collection of useful links to context and prior discussion. I invite you to read that; I couldn't possibly summarise all the discussion here.

Before we move these classes to base, I propose that we get their names right, along the lines discussed by @chessai and others on this semigroupoids issue (Foldable1 -> Semifoldable, Bifoldable1 -> Semibifoldable, with matching renamings for class members).

Hence this proposal involves multiple steps:

  1. Release version 6.0.0 of semigroupoids with a mass-renaming (Foldable1 to Semifoldable, Bifoldable1 to Semibifoldable, etc.)
  2. Update downstream packages of semigroupoids to use the new major version and hence the naming scheme.
  3. Let the ecosystem settle for a while.
  4. Move Semifoldable and Semibifoldable to base.

We could alternatively do the renaming and relocation as one step, but this approach leads to a rather unfortunate migration path for users. I would prefer they be two simple and discrete steps, giving the considerable downstream package ecosystem time to breathe in between.

This proposal an additive (non-breaking) change to base, and so that step is relatively easy. Moving other classes from semigroupoids to base is explicitly out of scope for this proposal. Most or all other classes (such as Semitraversable) require breaking changes elsewhere in the hierarchy (such as wedging Semiapplicative into the Monad hierarchy).

I note that this is as much a proposal about semigroupoids as it is about base; hence we would need the support of the semigroupoids maintainers(cc. @ekmett).

I still think that Foldable1 (or whichever name it's renamed to) should be moved out of semigroupoids into own package (Bifoldable1 can be there on in bifunctors), as semigroupoids is just too dependency heavy as used as compatibility package. People don't want to depend on it. If semigroupoids is used as a compat package, the adoption may be abysmal. (I know an example where a maintainer didn't want to depend on semigroups, and Semigroup instances were added only when the support for older GHCs were dropped to avoid non-uniform APIs).

I.e. First move & rename functionality into own package (like https://github.com/phadej/foldable1), then make semigroupoids depend on it (like in ekmett/semigroupoids#87).

I don't see a need to wait for ecosystem to settle Foldable1 is defined in handful of packages: instance.*Foldable1 search shows 50 packages https://hackage-search.serokell.io/?q=instance.*Foldable1 (also good part is Edwards packages) I.e. if change is done today to semigroupoids, the next base (released in 6 months?) could well have Foldable1 in it.

I like @phadej's suggestion to factor out Foldable1 into a package on its own. Otherwise future packages, using Foldable1 from base, will be forced either support only GHC 9.4+, or depend on relatively heavyweight semigroupoids.

CC @ekmett @RyanGlScott on behalf of semigroupoids.

I think it could help to split this task into three:

  1. Does CLC approve inclusion of Foldable1 into base in principle?
  2. If yes, is CLC in favor of renaming to Semifoldable?
  3. If yes, which migration plan is recommended by CLC?

Dear CLC members, could you please answer questions 1 and 2, before we venture on bikeshedding 3?

gwils commented

As you'd expect, I'm yes on 1 and 2.

  1. Yes.
  2. I'm slightly more inclined to keep current naming to simplify migration. While Semifoldable sounds great, semimaximum and semilast are horrible.

CC @emilypi @chessai @cgibbard @cigsender

I'm slightly more inclined to keep current naming to simplify migration. While Semifoldable sounds great, semimaximum and semilast are horrible.

My proposal had different naming schemes. NonEmptyFoldable naming scheme is not horrible. And doesn't try to make Semi prefix to mean non empty. EDIT: semi means "not completely: semigroup is almost group, no inverse so no identity; semimaximum doesn't make sense; semifoldr more, as that's a foldr without initial element; semifoldrMap is questionable, foldrMap1 means "map 1, foldr rest" :)

Another

head, maximum and foldr1 already exist in base.
These are good names, but they are partial. So there is argument

  • for some methods and functions to be named even without an affix, i.e. have good names for the right things.
  • or against, as that would cause name clashes (yet, people using Foldable1 probably won't use maximum from Foldable or head from Data.List; but these are Prelude names, so inconvenience is present).

Hmm. Using unprefixed head, last, maximum and minimum for Semifoldable could be a bold, but long-awaited move to pave an (extremely long-term) migration from partial Prelude.

I'm yes on 1, no on 2. I think the *1 suffix is fine on the class name, but that we should clash with the Prelude on the function names. I agree with @Bodigrim that it will steer us in the right direction when the time comes to de-partialize the Prelude.

FWIW (not a CLC member) :

  • moving Foldable1 to base is a great idea , +1
  • renaming it to SemiFoldable (or NonEmptyFoldable, even better as @phadej suggests ) : +1 , because currently there are classes like Show1 and Eq1 in base for * -> * types (and Generic1 will land soon afaik). I'm concerned this overloaded meaning of 1 would cause confusion.

Hmm. Using unprefixed head, last, maximum and minimum for Semifoldable could be a bold, but long-awaited move to pave an (extremely long-term) migration from partial Prelude.

@Bodigrim this is a great suggestion and I'm sure would be well received.

szabi commented

Hmm. Using unprefixed head, last, maximum and minimum for Semifoldable could be a bold, but long-awaited move to pave an (extremely long-term) migration from partial Prelude.

(Not a CLC member): I think this argument (re naming schemes) should be considered as particularly important.

I'd also consider Semi- as way to jargon-y without actually having good precedent for it. Current Foldable1 seems fine; NonEmptyFoldable seems ok, but nonEmpty- for the class functions seems unergonomically verbose.

So, I feel like it would be Foldable1 or NonEmptyFoldable for me, and unprefixed functions, at the current state of the discussion.

These are the two cents of a bystander.

(Not a CLC member)

I'd love to see typeclasses Foldable1 in base. I'm -1 on renaming them. I don't think that renaming solves any problem. It'll introduce even more work for volunteers to migrate their packages and I don't get the reasons for asking people to do more work because new names somehow make more sense. They don't to me and I think that Foldable1 is already perfectly fine.

Regarding Eq1 and Show1 typeclasses: we now have QuantifiedConstraints in GHC and those classes can be safely deprecated. So there's no need to continue this naming convention with the meaning "typeclasses for higher-kinded data types".

Let's not forget that there are multiple "user personas" involved here:

  • people who have never used Foldable1 or semigroupoids : Perhaps they would start using folds over non-empty data once it's in base but they don't face breaking change.
  • authors who are currently relying on semigroupoids: they would be impacted by a renaming of Foldable1.
  • all the work around Generic: like it or not, Show1, Eq1 and Generic1 have been in base for a while and many libraries rely upon them. While we can argue that the 1 is not a suggestive naming choice, it's become canon by now.

Questions :

  • is the deprecation of Show1, Eq1 "a thing"? What's the timeline and/or process for deprecating them ? I cannot wrap my head around this use of QuantifiedConstraints.
  • Generic1 and Generically have recently landed in base (https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5726/diffs), should they be renamed too?

is the deprecation of Show1, Eq1 "a thing"?

It's not that simple. In particular Eq1 f is strictly more powerful than forall a. Eq a => Eq (f a). We can do

applesAndOranges :: :: [Apple] -> [Orange] -> Bool
applesAndOranges = liftEq appleAndOrange where
  appleAndOrange :: Apple -> Orange -> Bool
  appleAndOrange = ...

i.e. compare containers of different values.

Similarly liftShowsPrec would allow to show Set Apple using canonical instance Set showing function, but let the user supply how to show an Apple. (I use Set as an example because you cannot coerce whole Set Apple to Set MyApple, and have a Show MyApple instance you'd like).

So Eq1/Show1/... add expressive power & vocabulary.

What will probably happen is that QuantifiedConstraints will become a constraints of Eq1, Show1 etc.:

class (forall a. Eq a => Eq (f a)) => Eq1 f where
  liftEq :: (a -> b -> Bool) -> (f a -> f b -> Bool)

The effect is that instead of eq1 :: (Eq1 f, Eq a) => f a -> f a -> Bool you could directly use ==, as Eq1 f, Eq a constraints will imply Eq (f a).

There is a Merge Request !4727, but it's currently a draft. I guess the change there will eventually be brought up to CLC.


Generic1 has been in base essentially forever now. Since GHC-7.4 / base-4.6. Renaming it would break quite a lot of code.

I don't see anything wrong with Generically name.

I'm happy to go with whatever the CLC wants to do here, up to and including handing the package over to the CLC.

Naming issue aside, after looking into this a bit, I'm still pretty unclear why these classes can't just keep living in semigroupoids. I guess I've never yet had to use them in anger, so they still seem pretty rarely-occurring to me... but maybe I'm just not aware of what kinds of code people are writing these days outside of my own bubbles.

I wonder what the ratio of packages transitively depending on semigroupoids due to the need to write Foldable1/Bifoldable1 instances vs. all packages transitively depending on semigroupoids is. That is to say, how much would the transitive dependency set of semigroupoids be reduced if these classes got moved into base?

At least in my case, while I've never really touched these classes, semigroupoids is almost always in my transitive dependency set on any medium to large project already, so depending on it once more never seems very painful.

Also, if it's really the case that separating these classes out would help, perhaps another answer that wouldn't require the blessing of the CLC at all would just be to split the package (as insufferable as it is to suggest a tiny package like that).

Also, if it's really the case that separating these classes out would help, perhaps another answer that wouldn't require the blessing of the CLC at all would just be to split the package (as insufferable as it is to suggest a tiny package like that).

One argument is that without Foldable1 in base there is no way Applicative would get Apply as a superclass (which naturally would pull Traversable1 into base as well). Whether that (very very) long-term plan is viable is a separate question, but by not having Foldable1 in base the answer is definitive: Apply won't ever be in base.

Otherwise I agree, tiny packages are fine: some, these, data-fix, witherable. I don't see a real need to roll these into the base in any form.

There is a dilemma though, what if someone would need a kind-generics instance for these. Would kind-generics maintainer depend on these for that instance (i'm somewhat sure they wont). I know that these won't depend on kind-generics (as there are too many generics libraries to depend on). Luckily nobody yet needed that (or were happy to define an orphan in their application).

The last point is kind of a motivation for a (very quick, without process) migration of Data.ByteArray from primitive to base. The motivation is that bytestring and text would like to use that type, and would need to depend on primitive, so the option was to either make primitive boot library or move the type to base. I think the wrong choice was made, I'd preferred whole primitive made into a boot library. (The compatibility story is still unclear to me in that move, will text depend on primitive for older base?).

EDIT: or was it because ghc itself needs ByteArray? Still not a reason, ghc library depends on a lot of libraries, primitive could been one of them. (https://hackage.haskell.org/package/base-4.16.0.0/docs/GHC-Arr.html is in base, but the proper package to use it is array. That's another potential design.)

But if CLC wants to be consistent, then Foldable1 should be moved base too, at least when haskell/containers#616 (nonempty Map and Set and ...) is closer to the finish line.

-1 to renaming.

Semi prefix gives a strong impression that it's weaker than the original, while it's the other way around.
I'm concerned about the confusion this could cause than confusion with Eq1, Show1, etc (they are kind of niche; I rarely use them). Even for experts who know what they are, it's a lot of work to rename the use sites.

Naming issue aside, after looking into this a bit, I'm still pretty unclear why these classes can't just keep living in semigroupoids. I guess I've never yet had to use them in anger, so they still seem pretty rarely-occurring to me... but maybe I'm just not aware of what kinds of code people are writing these days outside of my own bubbles.

It's true that it's not a huge inconvenience for intermediate to advanced users who are familiar with setting up projects and pulling in dependencies. However it would be nice to have this in base for people just learning the language. Part of teaching Haskell is teaching Haskell values such as "make invalid states unrepresentable". Nonempty structures are a clear example of this principle, and having better support for accessing them ergonomically in base would make that easier to teach.

Additionally, there are Foldable methods (minimum(By), maximum(By), foldl1, foldr1, etc.) that should have Foldable1 constraints, not Foldable constraints. Moving Foldable1 into base will help realise the long-term goal of moving these functions to the proper typeclass.

(not CLC) Just to summarise, there seems to be a weak consensus for importing Foldable1 and friends into base (with the associated deprecation cycle for semigroupoids ) and against renaming them. AFAICT renaming is the most contentious issue, which would cause maintenance busywork on those that currently import those typeclasses (users of semigroupoids).

Type class name

I don't see an issue with keeping Foldable1. Indeed there are Eq1, Ord1, Show1, which generalize base classes in a different direction, but is there a real ambiguity? Eq constraints *, while Eq1 constraints * -> *. But Foldable is already for * -> *, what other non-contrived generalization is available?

Keeping an old name has a usual benefit of causing less breaking changes for clients and less work on updating imports.

Rationale for inclusion in base

From my perspective the biggest motivation to include Foldable1 into base is that Foldable is flawed and exposes users to unsafe functions. Right now every time you encounter Foldable.foldl1 or Foldable.maximum, you do not know whether it is safe, because for some types it is and for others not at all.

Ideally we should put huge red warning signs "Unsafe partial function, provided for historical reasons, please consider using ... instead", but right now we cannot even suggest any safe replacement in base or nearby. A learner just exploring Haskell in ghci should not be pointed to installing semigroupoids and rolling out their own implementations of foldl1 and maximum.

The reduced dependency footprint is a lesser concern, but for the record the majority of libraries I work on do not depend on semigroupoids.

I know that several alternative preludes ship Foldable1, e. g., relude and rebase. @chshersh @nikita-volkov could you possibly comment on your motivation to do so?

Method names

I'm on board with a "fat" set of functions as outlined in @phadej's proposal. I believe that functions, providing total alternatives to partial functions in Prelude and Data.Foldable, should share the same name to simplify migration from unsafe to safe. This gives us fold{l,r}1{,'} and m{ax,in}imum and head / last. Further, fold1 / foldMap1 / foldMap1' should retain their existing names from Data.Semigroup.Foldable.

Bifoldable1

I'm not convinced about moving Bifoldable1 to base. At least my motivation described above does not fit for this case, and it seems to be much more niche, isn't it?

Migration

While adding Data.Foldable1 to base is not a breaking change, to leverage added value and facilitate smoothless adoption, it is important to provide a low-dependency compatibility package foldable1, providing the same interface for users of older GHCs. It seems that all ground work is already there (thanks @phadej) and we have @ekmett's blessing to go forward, so we just need to settle on naming.


@emilypi @chessai @cgibbard could you please opine on naming?

@Bodigrim if Foldable1 is moved to base, then Bifoldable1 should be moved at least to bifunctors package to avoid orphans: e.g. Join has Bifoldable1 p => Foldable1 (Join p) instance (currently in semigroupoids). I'm sure Edward is fine with that too.

EDIT: And my plan was to use bifunctors as compat package for Bifoldable1 anyway, so the practical difference is small.

CC @treeowl @sjakobi @mstksg on behalf of nonempty containers.
CC @chris-martin @argumatronic to get an educational perspective.

@Bodigrim One of the relude goals is to be lightweight in terms of dependencies and depend on boot libraries exclusively (with unordered-containers being the only exception to this rule). semigroupoids is not a boot library so depending on it is not an option, considering project goals.

However, we still want to have the Foldable1 abstraction in relude. The motivation for having this typeclass is outlined in the corresponding issue:

In short, having Foldable1 in relude:

  • Enables to write total versions of functions like foldr1, maximum1, minimum1, etc.
  • Allows to battle test in real projects the fact that something like Foldable1 is in a standard library

@Bodigrim Adding Foldable1 to base seems useful to me. I don't really have an opinion on Bifoldable1 โ€“ I don't think I've used or wanted to use it before. I'm not a fan of the name Semifoldable so far โ€“ I don't think it suggests that it is about non-empty structures in any way.

I can't say much about the work on non-empty Maps and Sets in containers. It seems like a nice feature addition to me in principle, but I'm unclear on the details, especially on the API changes involved. Note that there is a package nonempty-containers, although I don't know whether or how it is related to the work in containers.

If you'd like me to comment on anything else, please let me know! :)

If it comes down to a choice is between:

  1. Foldable1 et al are moved into a minimal foldable1 package; or
  2. Foldable1 at al are moved into base, and a minimal foldable1 compatibility package is also provided

then I think the decision rests largely on what you want to communicate to the general Haskell-aware public.

Some connotations I expect casual observers are likely to perceive as a result of including Foldable1 in base:

  • Foldable1 is something that a person who learns Haskell ought to be aware of. It is something that is recommended. It is not something exotic.
  • If you're going to publish a library that defines and exports a datatype, you can be reasonably expected to provide a Foldable1 instance for your datatype, if applicable.

(Taking the further step of including Foldable1 in Prelude would convey the same things, of course, more emphatically.)

Moving it into base is a cultural change that makes it a part of what Haskell is. So, is that what you want to do?

szabi commented

(musings of a non-CLC member)

  • Foldable1 is something that a person who learns Haskell ought to be aware of. It is something that is recommended. It is not something exotic.

  • If you're going to publish a library that defines and exports a datatype, you can be reasonably expected to provide a Foldable1 instance for your datatype, if applicable.

I think these are reasonable goals, in line with enabling and promoting "more correct"/"more safe" Haskell coding practices ergonomically. I personally believe we should promote that, and as you wrote, including them in base makes exactly this statement.

While additions to base should be very thoroughly and well considered, lest we end up stuck with suboptimal representations for decades, cf. partial head, a mere addition is not a breaking change, so after ample consideration this alone should not be contentious. Utilizing these new classes as constrains for existing functions is breaking, but obviously must be the path forward (potentially with a sufficient migration period and plan with warnings, etc.) if the committee decides to include the classes in base.

CLC discussed the design of Foldable1 today. Current (non-binding) opinion is to retain Foldable1 name and have unsuffixed methods for m{ax,in}imum and head / last. CLC also tentatively supports adding Bifoldable1 to base.

CLC did not come to a conclusion about following class members:

    -- fold variants with premap.
    -- Without this map, we cannot implement foldl using foldr etc.
    foldrMap1  :: (a -> b) -> (a -> b -> b) -> t a -> b
    foldlMap1' :: (a -> b) -> (b -> a -> b) -> t a -> b
    foldlMap1  :: (a -> b) -> (b -> a -> b) -> t a -> b
    foldrMap1' :: (a -> b) -> (a -> b -> b) -> t a -> b

My understanding is that the motivation for their inclusion is ekmett/semigroupoids#77 (comment), but could interested parties perhaps elaborate more? CC @andrewthad @phadej @recursion-ninja

I have had to roll my own foldlMap' twice. Both times it involved consuming a linear, non-empty structure of "commands" into "effects."

How bad is it to just let those extra methods be functions? Do people actually want to define them? I know I would prefer to avoid doing so if possible.

I'm in favor of including Foldable1 and see it as part of the larger goal of "departializing" the Prelude etc. I'd be happier with the change knowing that this wasn't just an ad hoc measure and that there is a plan or goal to clean up the Prelude, normalize using non-empties, and generally working toward more correct/safer Haskell.

This goal, to be sure, will render most of the existing educational resources for Haskell "broken" to one degree or another, but since most of them were produced before NonEmpty was even in base and many (most?) don't even mention it or any functions over it. So, perhaps we should already be considering them broken in that regard. This could seriously change the things we show first (in absolute beginner tutorials, for example) for the better. If this were Twitter, I'd have to insert some clapping hands and say NORMALIZE NONEMPTY.

For my purposes, I'd prefer a naming scheme involving a mention of NonEmpty but understand this is cumbersome to type, and we all know programmers hate typing, and I don't find generally find names themselves are a serious impediment to learning.

I don't have much occasion myself to use Bifoldable1 so have no comment on that.

I should add that advance warning of big-picture changes is also very helpful for tutorial maintainers. If the main reason the committee approves this proposal is because its members each privately hope that it will lead to larger changes (Departialization Of Prelude), I would personally find it preferable to see an explicit plan for the larger project, rather than watch the process unfold piece-by-piece without knowing the overall trajectory. Sometimes things get pulled into base and wholly embraced; sometimes things get pulled into base with greater ambitions that never end up getting fulfilled. Knowing what's going to happen informs what I'm going to choose to recommend that will best serve readers in the years to come. It would be much more comfortable to start emphasizing NonEmpty in intro courses because the committee has committed to supporting it, rather than to make a guess based on reading behind the lines in proposal discussions.

There is no explicit plan for coveted Departialization Of Prelude, and I'm not sure whether it is even possible in our lifetime, much less about how.

A more reachable goal is a partial departialization of Data.Foldable. The very first step is at least to provide safe alternatives for those who are tired of Exception: Prelude.maximum: empty list. This is what this proposal is basically about.

If Data.Foldable1 API gets traction and we see more mainstream usage of nonempty sets/maps/vectors, then maybe in 3-4 years we can start talking about gradual supplanting of the old partial API.

How bad is it to just let those extra methods be functions? Do people actually want to define them? I know I would prefer to avoid doing so if possible.

I think it would be fine to define them as functions within the module that defines and exports Foldable1.

How bad is it to just let those extra methods be functions? Do people actually want to define them? I know I would prefer to avoid doing so if possible.

Note the default definitions:

  foldMap1 :: Semigroup m => (a -> m) -> t a -> m
  foldMap1 f = foldrMap1 f (\a m -> f a <> m)

  foldrMap1 :: (a -> b) -> (a -> b -> b) -> t a -> b
  foldrMap1 f g xs =
      appFromMaybe (foldMap1 (FromMaybe #. h) xs) Nothing
    where
      h a Nothing  = f a
      h a (Just b) = g a b    

Also IIRC my benchmarks (in https://github.com/phadej/foldable1) showed that when you can use foldrMap1 or foldlMap1
it's often more efficient. (Then foldr1Map defined in terms of foldMap1).

For example
Screenshot from 2021-11-16 09-03-12

  • vanilla is all methods defined: fast
  • Just foldMap1 and foldr1Map variants are slower

For Tree it was a lot worse:

Screenshot from 2021-11-16 09-07-12

I'm curious why they're slower. I'm not opposed to the new methods, but can you explain what makes them helpful?

but can you explain what makes them helpful?

I honestly don't have any energy to discuss this issue. I'm very sorry.

EDIT: my implementation has maximum implemented using foldMap1, but it probably should use foldMap1', as that is changed in base for its maximum in the meantime.

EDIT: foldr1Map is more general then foldr1. General enough that we can define foldMap1 using foldrMap1, I couldn't figure out whether we can define foldMap1 using foldr1 (I think it's impossible). So, if for performance sake users define foldr, then they probably should define foldr1Map too. If there are no such examples, then foldr1 should not be in the class either.

i found some energy:

Consider an alternative definition of NonEmpty:

data NonEmpty2 a
    = L a
    | a :$ NonEmpty2 a

It's natural eliminator is foldr1Map:

foldr1Map :: (a -> b -> b) -> (a -> b) -> NonEmpty2 a -> b
foldr1Map _ g (L a)     = g a
foldr1Map f g (a :$ as) = f a (foldr1Map f g as)

Then we can look at usual NonEmpty is "new" light,
it's one element in front of many cons:

data NonEmpty a = a :| [a]

Thus it's quite natural eliminator is foldl1Map:

foldl1Map :: (b -> a -> b) -> (a -> b) -> NonEmpty a -> b
foldl1Map f g (a :| as) = foldl f (g a) as

@treeowl

How bad is it to just let those extra methods be functions? Do people actually want to define them? I know I would prefer to avoid doing so if possible.

You don't need to, they have default definitions using foldMap1.
but they do very much make sense for NonEmpty variants.
(i.e. linear containers @recursion-ninja mentioned).
They (left-folds) also make sense for Tree.

@phadej the usual NonEmpty comes out looking a tad awkward in that context, I think. data NR a = NR a (Reverse [] a) would be a better fit for foldl1Map, wouldn't it?

@treeowl not if you think of strict foldl'.

@phadej , yes, the strict version makes sense for that, and the lazy one seems to be the natural fold for the Reverse version. Messy, but I'm starting to see why they're desirable.

The problem here seems to be that there's no real operation semantics required to be shared between instances of foldl, foldl', foldr, foldr', ... . The semantics depend on the precise implementation of the data type. foldl' is the appropriate accumulating function for cons lists, whereas foldr' is the appropriate one for snoc lists. foldr is the appropriate way of lazily streaming a cons list, foldl is the appropriate way for a snoc list. For tree types, where are intermediate between the two, the choice is more confusing. This makes it impossible to use Foldable as a genuine abstraction with comprehensible performance implications. All that it can be used for is name overloading. This problem is now leaking into the discussion about Foldable1.

The problem here seems to be that there's no real operation semantics required to be shared between instances of foldl, foldl', foldr, foldr', ... .

I'm confused. Data.Foldable gives precise laws, linking semantics of foldl and foldr.

Denotational semantics, yes. But denotationally there is no difference between foldl, foldl' and foldr . flip. Operationally they are critically different! That's why I say there is no meaningful way to program Foldable-polymorphically.

To be a little less abstract, suppose I define

sum :: Foldable t => t [Int] -> Int
sum = foldl' (+) 0

Then sum works well for [] but poorly for SnocList where:

data SnocList a = Nil | Snoc (SnocList a) a deriving Foldable

Therefore I find Foldable to be a poor tool for generic programming.

That's why sum is defined as

sum :: (Foldable t, Num a) => t a -> a
sum = getSum #. foldMap' Sum

in base.

Having qualified or prefixed List.foldr, Set.foldr, etc is inconvenient. Type-classes can be used for adhoc polymorphism, there is nothing wrong with that, and Foldable is good home for foldr etc.

I'm not suggesting there's anything wrong with it (although I would call it overloading rather than ad hoc polymorphism) just that it seems to be at the root of the performance issues that you and David were discussing earlier.

And for the record, foldMap' doesn't help. In the following code the sum of a one million element list runs in constant space. sum of a one million element SnocList takes ten times as long and 45MB of heap! I suppose the correct solution is to roll ones own foldMap'. I haven't thought of the consequences of that. But the point remains that foldl/r/' are not sufficiently generic.

{-# LANGUAGE DeriveFoldable #-}

data SnocList a = Nil | Snoc (SnocList a) a deriving Foldable

fromTo :: Int -> Int -> SnocList Int
fromTo i n | i >= n = Nil
           | otherwise = Snoc (fromTo (i+1) n) n

main :: IO ()
main = do
  print (sum (fromTo 1 (1 * 1000 * 1000)))
  print (sum [1..1 * 1000 * 1000])

(ghc-9.2 required, to get the sum in terms of foldMap')

Note: I'm only talking about foldrMap1, foldlMap1 and foldMap1 here, but ofc the same applies to foldrMap1', foldlMap1' and foldMap1'.

At first, I wanted to ask why foldrMap1/foldlMap1 are needed when we have foldMap1, until I realized that they actually do something different. I'm not sure naming them foldrMap1/foldlMap1 is a good idea, since it suggests (at least to me) a close relationship to foldMap1 (the real problem is rather that foldMap took the name for meaning "combine the results with <>", but it's too late to change that now). However, I can't think of anything better either, as the current names are pretty descriptive.

Responding to the request in #9 (comment) for the motivation behind foldrMap1:

foldrMap1  :: (a -> b) -> (a -> b -> b) -> t a -> b

Let's say that I have a non-empty container and I want to encode everything in it as text and put commas between the elements. Here is what the builder type and functions might look like:

-- keeping this abstract for clarity
model Bldr where
  data Builder
  comma :: Builder
  person :: Person -> Builder

Next, we have a non-empty set of Person and we want to encode all people with commas between them:

encodeManyPeople :: NonEmptySet Person -> Builder
encodeManyPeople = foldrMap1 Bldr.person (\p acc -> Bldr.person p <> Bldr.comma <> acc)

This was what originally made me want these functions in the class.

Playing the devil's advocate:

@andrewthad, in that case you could (and probably should if you write Foldable1 polymorphic code) defined an auxiliary newtype:

newtype CommaBuilder = CommaBuilder { unCommaBuilder :: Builder }

instance Semigroup CommaBuilder where
  CommaBuilder x <> CommaBuilder y = CommaBuilder (x <> Bldr.comma <> y)
  
encodeManyPeople :: Foldable1 f => f Person -> Builder
encodeManyPeople xs = unCommaBuilder (foldMap1 (CommaBuilder . Bldr.person) xs)

-- or using some newtype library:
encodeManyPeople = alaf CommaBuilder foldMap1 Bldr.person

@phadej That's a neat trick. I had not considered that, and I think the performance should be the same. Haven't tested it though.

gwils commented

I think enough time has passed that discussion has settled. It seems as though we have a vague consensus on:

  • Bringing Foldable1 and Bifoldable1 into base
  • Not renaming them
  • Using a new foldable1 package as a compatibility package for Foldable1
  • Using the existing bifunctors package as a compatibility package for Bifoldable1
  • Including the foldrmap1 etc. methods in the Foldable1 class

Can I please get a CLC vote on this version of the proposal?

I'm +1.

I'm getting convinced that foldrMap1 and friends make sense. But if we put them as type class members, do we need to retain foldr1 as a class member as well? Is it ever possible to have a better implementation than foldr1 = foldrMap1 id?

It would be nice if someone put together an MR to base, so that we clearly see what we vote for.

@strake is it still correct, that you are prepared to donate foldable1 package name to a compatibility package, if this proposal is approved?

Is it ever possible to have a better implementation than foldr1 = foldrMap1 id?

Good point. I haven't found one.

It would be nice if someone put together an MR to base, so that we clearly see what we vote for.

I think you are still unclear about the naming. I understand that the naming scheme you want is

  • not renaming
  • having un-affixed names for head, ... i.e.
  • You are unsure whether foldr1,... should be members.
module Data.Foldable1 where

class Foldable t => Foldable1 t where
  fold1      :: Semigroup m => t m -> m
  foldMap1   :: Semigroup m => (a -> m) -> t a -> m
  foldMap1'  :: Semigroup m => (a -> m) -> t a -> m

  -- These four may or may not be a memberse
  foldr1     :: (a -> a -> a) -> t a -> a
  foldr1'    :: (a -> a -> a) -> t a -> a
  foldl1     :: (a -> a -> a) -> t a -> a
  foldl1'    :: (a -> a -> a) -> t a -> a

  toNonEmpty :: t a -> NonEmpty a

  maximum    :: Ord a => t a -> a
  minimum    :: Ord a => t a -> a
  head       :: t a -> a
  last       :: t a -> a

  foldrMap1  :: (a -> b) -> (a -> b -> b) -> t a -> b
  foldlMap1' :: (a -> b) -> (b -> a -> b) -> t a -> b
  foldlMap1  :: (a -> b) -> (b -> a -> b) -> t a -> b
  foldrMap1' :: (a -> b) -> (a -> b -> b) -> t a -> b

intercalate1 :: (Foldable1 t, Semigroup m) => m -> t m -> m
foldrM1      :: (Foldable1 t, Monad m) => (a -> a -> m a) -> t a -> m a
foldlM1      :: (Foldable1 t, Monad m) => (a -> a -> m a) -> t a -> m a
foldrMapM1   :: (Foldable1 t, Monad m) => (a -> m b) -> (a -> b -> m b) -> t a -> m b
foldlMapM1   :: (Foldable1 t, Monad m) => (a -> m b) -> (b -> a -> m b) -> t a -> m b
maximumBy    :: Foldable1 t => (a -> a -> Ordering) -> t a -> a
minimumBy    :: Foldable1 t => (a -> a -> Ordering) -> t a -> a

-- this comes for free from intercalate1 implementation
intercalateMap1 :: (Foldable1 t, Semigroup m) => m -> (a -> m) -> t a -> m

module Data.Bifoldable1 where

class Bifoldable p => Bifoldable1 p where
  bifoldMap1 :: Semigroup s => (a -> s) -> (b -> s) -> p a b -> s

I'd suggest to nail this down before implementing.

(This is essentially Appendix: Foldable1 synopsis with clashing names in https://oleg.fi/foldable1-proposal3.html, correcting Bifunctor -> Bifoldable typo)

In line with recent developments, if there is no better implementation of foldX1Y than foldXMap1Y id, I think foldX1Y should better live outside of the class. This will also make Foldable1 a bit less intimidating.

I.e.

module Data.Foldable1 where

class Foldable t => Foldable1 t where
  fold1      :: Semigroup m => t m -> m
  foldMap1   :: Semigroup m => (a -> m) -> t a -> m
  foldMap1'  :: Semigroup m => (a -> m) -> t a -> m

  toNonEmpty :: t a -> NonEmpty a

  maximum    :: Ord a => t a -> a
  minimum    :: Ord a => t a -> a
  head       :: t a -> a
  last       :: t a -> a

  foldrMap1  :: (a -> b) -> (a -> b -> b) -> t a -> b
  foldlMap1' :: (a -> b) -> (b -> a -> b) -> t a -> b
  foldlMap1  :: (a -> b) -> (b -> a -> b) -> t a -> b
  foldrMap1' :: (a -> b) -> (a -> b -> b) -> t a -> b

-- These are trivially foldr1 = foldrMap1 id etc.
foldr1     :: Foldable1 t => (a -> a -> a) -> t a -> a
foldr1'    :: Foldable1 t => (a -> a -> a) -> t a -> a
foldl1     :: Foldable1 t => (a -> a -> a) -> t a -> a
foldl1'    :: Foldable1 t => (a -> a -> a) -> t a -> a

intercalate1 :: (Foldable1 t, Semigroup m) => m -> t m -> m
foldrM1      :: (Foldable1 t, Monad m) => (a -> a -> m a) -> t a -> m a
foldlM1      :: (Foldable1 t, Monad m) => (a -> a -> m a) -> t a -> m a
foldrMapM1   :: (Foldable1 t, Monad m) => (a -> m b) -> (a -> b -> m b) -> t a -> m b
foldlMapM1   :: (Foldable1 t, Monad m) => (a -> m b) -> (b -> a -> m b) -> t a -> m b
maximumBy    :: Foldable1 t => (a -> a -> Ordering) -> t a -> a
minimumBy    :: Foldable1 t => (a -> a -> Ordering) -> t a -> a

-- this comes for free from intercalate1 implementation
intercalateMap1 :: (Foldable1 t, Semigroup m) => m -> (a -> m) -> t a -> m

module Data.Bifoldable1 where

class Bifoldable p => Bifoldable1 p where
  bifoldMap1 :: Semigroup s => (a -> s) -> (b -> s) -> p a b -> s

?

Yes, at least from my point of view.

I really don't see the point of adding Bifoldable1 to base. I never even did or want to use Bifoldable and I think Bifoldable1 is even more niche. The only instances for Bifoldable seem to be for Either, tuples and wrapper types, nothing particularly meaningful IMO.

I think that base already has too many classes that are rather niche and don't belong in a standard library (Category, Arrow, MonadFix, MonadZip, Bifunctor, Bitraversable). So unless you can provide me a convincing use case for Bifoldable1 (other than "Foldable has Foldable1, so Bifoldable needs Bifoldable1"), I really don't see the point of adding it to base.

I personally haven't gotten anything from Bifoldable or Bitraversable. Biapplicative can do some neat tricks tricks, but the only reason to add it to base would be to make traverseBia a method of Traversable.

IMO CLC could just vote separately whether Bifoldable1 should be moved to base or not. It can be moved to bifunctors. That's fine too.

FWIW I have plenty of use cases for Bifoldable / Bitraversable, and some of them could use Bifoldable1 as well. For example, here is a data type which can benefit from it:

data Statement var expr
  = Set var expr
  | Let var expr 
  | Trace expr 
  | Declare var

I'm not particularly bullish on this, but adding Data.Bifoldable1 does not cost us much and seems logical to me.

Dear CLC members, let's hold a signal (non-binding) vote before we ask someone to put up an MR. Are you happy with an interface proposed in #9 (comment)? Do you approve Bifoldable1 or would you rather have a separate proposal for it?

(if you have more than two options on the table that are not 100% independent, instead of multiple votes, I can strongly recommend using ranked voting and picking the Condorcet winner. This simplifies many things, and you don't have to be shy adding more variants to the ballot for example.)

@nomeata good point, we can try it if a signal vote appears unclear.
CC @gwils @chessai @emilypi @cigsender @cgibbard

gwils commented

I'm in favour of both that interface and Bifoldable1

I think we might as well add it all in one go. Saves us the trouble of adding Bifoldable1 later.

One more example for Bifoldable1. Here is a type of expressions:

data Expr t
  = Add (Expr t) (Expr t)
  | Mul (Expr t) (Expr t)
  | Literal t

It is Foldable1. Now if we convert it to an unfixed representation we get

newtype Fix f = Fix (f (Fix f))

data ExprF t r 
  = Add r r 
  | Mul r r 
  | Literal t

type Expr t = Fix (ExprF t)

where ExprF is Bifoldable1.

Iโ€™m in favor of both the interface as per #9 (comment) and adding Bifoldable1 in one go.

@chessai @cgibbard @emilypi what is your take?

@chessai @emilypi @cgibbard just a gentle ping.

I will respond Monday.

@emilypi @cgibbard a kind reminder about the question in #9 (comment). I'm keen to make some progress before everyone is off for festive season.

@chessai @emilypi @cgibbard could we please make progress on this?

I don't think I'm a roadblocking decision maker. That said...

  1. I think Foldable1 looks great.
  2. I'm not knowledgeable enough to comment on Bifoldable1. If I had made an effort to utilize Bifoldable structures more, I might have some insight. Unless someone can identify a method to add to Bifoldable1, I think it's sufficient for now.

Iโ€™m in favor of both the interface as per #9 (comment) and adding Bifoldable1 in one go.

@chessai @cgibbard @emilypi what is your take?

+1 from me

I assume that @cgibbard and @emilypi silently agree with the plan ;) This now awaits for someone to raise an MR for CLC to vote.

I don't have a very strong opinion either way on this, but it seems fine.

+1

+1

For anyone interested in making a MR, there's haddocks for a compatibility package of what I think is the most agreed implementation of Data.Foldable1 and Data.Bifoldable1 modules atm at https://oleg.fi/haddocks/foldable1/. The sources are at https://github.com/phadej/foldable1

@phadej , Thank you for your tireless efforts throughout the Haskell ecosystem. This is great work.

Dear CLC members, let's vote on https://gitlab.haskell.org/ghc/ghc/-/merge_requests/8495 which adds Data.Foldable1 and Data.Bifoldable1 to base. The design of the type classes matches #9 (comment), which we has approved above. The source code is extracted from https://github.com/phadej/foldable1, where you can find tests and benchmarks as well.

@tomjaguarpaw @chessai @cgibbard @emilypi @mixphix


+1 from me.

+1

+1

-1


Foldable is a poorly-designed class (explanation elaborated upon in another thread) and I'm against propagating its design into other classes.

@cgibbard @mixphix just a gentle reminder to vote.

+1

Altogether 4 votes in favour and 1 against out of 6 possible, which is enough for the proposal to pass. Thanks all, approved.

I couldn't decide whether to vote +1 or -1 on it and realized that it wouldn't change the result anyway just as voting closed. :)

I'm still not sure this needs to be moved to base, but if people feel it does, I suppose it's fine.

fwiw @tomjaguarpaw i think you are correct in the long run, but that would be a separate proposal with less of a migratory burden than this one. This one accomplishes roughly a part of that migration that we'd need to do anyway.

I'm trying to summarise the state of this proposal as part of my volunteering effort to track the progress of all approved CLC proposals.

Field Value
Author @gwils, @Bodigrim
Status merged
base version 4.18.0.0
Merge Request (MR) https://gitlab.haskell.org/ghc/ghc/-/merge_requests/8495
Blocked by nothing
CHANGELOG entry present
Migration guide not needed

Please, let me know if you find any mistakes ๐Ÿ™‚