fsharp/fslang-suggestions

"for-with" syntactic sugar for folds

reinux opened this issue · 2 comments

I propose we add syntax sugar to turn

let result =
  for s with t in ts do
    s + t

into

let result =
  ts
  |> #Collection.fold (fun s t -> s + t) s

The with keyword introduces the initial state, and the value inside the for expression is expected to be the same as the initial state.

The existing ways of approaching this problem in F# are

fold

Folds are widely used yet slightly unwieldy, especially for beginners but even for experienced users. Nested folds tend to get really messy if we aren't extra careful with formatting. There are also several ways to write them: (state, list) ||> fold f, list |> fold f state, fold f state list. Each style has its detractors, often not without good reason. Only the first offers type inference for both state and `list within the function, but is a somewhat obscure approach.

let rec f s ts = ...

Ad-hoc recursive functions are a bit verbose, so we don't use them unless we really need to.

let mutable s in for t in ts do s <- ...

Finally, there probably isn't anything wrong with using a mutable accumulator, but it's just not something people like to reach for because it feels icky.

Pros and Cons

The advantages of making this adjustment to F# are

  • Conducive to useful type inference (because s is specified prior to the function)
  • Much easier for new FP users to understand without sacrificing immutability
  • Better legibility for experienced users in many situations, e.g. when nested
  • Alleviate the need for closing brackets at the end of the fold function or for trailing arguments, which tend to be a bit unsightly following long funs
  • Natural addition to the for loop

The disadvantages of making this adjustment to F# are

Yet more syntax to document and learn. On the other hand, it seems quite intuitive and hence easy to remember.

I don't think it should require any changes to the API for CEs, but I could be wrong.

Extra information

Estimated cost (XS, S, M, L, XL, XXL):

M (please correct me if I am wrong)

Related suggestions: (put links to related suggestions here)

Perhaps we can have a yield version that translates into a scan:

let results =
  [ for s with t in ts do
      yield s + t
  ]

There is some slight ambiguity here, as yield is no longer a required keyword when the for is used in a sequence/list/array. Some users may expect for-with to behave as a scan even without the yield keyword if it is within a list.

Please let me know if I should open another issue for this suggestion, and I will update with the link here.

Affidavit (please submit!)

Please tick these items by placing a cross in the box:

  • This is not a question (e.g. like one you might ask on StackOverflow) and I have searched StackOverflow for discussions of this issue
  • This is a language change and not purely a tooling change (e.g. compiler bug, editor support, warning/error messages, new warning, non-breaking optimisation) belonging to the compiler and tooling repository
  • This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it
  • I have searched both open and closed suggestions on this site and believe this is not a duplicate

Please tick all that apply:

  • This is not a breaking change to the F# language design
  • I or my company would be willing to help implement and/or test this

For Readers

If you would like to see this issue implemented, please click the 👍 emoji on this issue. These counts are used to generally order the suggestions by engagement.

This is interesting. I have wondered about something like this before...

Re: adding something like this to the language

For something like this to work in F#, there would need to be some type-based and/or structural definition of "foldable" as there currently is for enumerables.

Even then, types would actually need to conform in their definitions to whatever convention was decided upon — unless this RFC were implemented: https://github.com/fsharp/fslang-design/blob/cd6085fb9f3a50093938d616fde8776d3be2cdad/RFCs/FS-1043-extension-members-for-operators-and-srtp-constraints.md

An alternative approach could be to borrow the idea from C# of using some kind of attribute to tie modules and their fold functions to their corresponding types. C# does this to find the appropriate Create method to use for collection expressions. Again, though, any existing modules would need to be annotated.

Maybe you could use a heuristic like "look for a fold function operating on type T on a module in scope called T or TModule," although that seems like a pretty messy way to go about it.

Computation expressions

It's already possible to implement something like this yourself using computation expressions. All of the problems above still prevent you from writing a more general computation expression that handles any "foldable" type, though (see, e.g., https://github.com/Savelenko/FSharp.Control.Fold?tab=readme-ov-file#making-your-data-types-compatible-with-the-library).

I can think of a few other syntaxes off the top of my head.

Here are some very rough proofs of concept: https://gist.github.com/brianrourkeboll/830408adf29fa35c2d027178b9f08e3c

  1. This might be the most consistent:

    fold state { for x in xs -> fun acc -> f acc x }
    let sum xs = fold 0 { for x in xs -> (+) x }
    
    let rev xs = fold [] { for x in xs -> fun acc -> x :: acc }
    
    let rebuild m = fold Map.empty { for KeyValue (k, v) in m -> Map.add k v }

    I.e., you provide the initial state as an argument to the builder, and you then simply yield functions that take the state and return a new value.

  2. This is kind of nice, too, though?

    fold state { for acc, x in xs -> f acc x }
    let sum xs = fold 0 { for acc, x in xs -> acc + x }
    
    let rev xs = fold [] { for acc, x in xs -> x :: acc }
    
    let rebuild m = fold Map.empty { for acc, KeyValue (k, v) in m -> acc.Add (k, v) }

    I.e., you provide the initial state as an argument to the builder, and then the intermediate state is exposed to you as the first element of a tuple produced via for.

  3. I also kind of like this, although it starts to make less sense if you allow multiple loops, standalone yields, etc.:

    fold { for acc, x in state, xs -> f acc x }
    let sum xs = fold { for acc, x in 0, xs -> acc + x }
    
    let rev xs = fold { for acc, x in [], xs -> x :: acc }
    
    let rebuild m = fold { for acc, KeyValue (k, v) in Map.empty, m -> acc.Add (k, v) }

    I.e., you provide the initial state as an argument to in, and then the intermediate state is exposed to you as the first element of a tuple produced via for.

You can extend any of these to handle additional foldable types, including non-enumerable ones, although you need to do it one by one:

type FoldBuilder<'T> with
    member inline builder.For (option : _ option, [<InlineIfLambda>] body : _ -> FoldBuilderCode<_>) =
        FoldBuilderCode<'T> (fun sm ->
            match option with
            | Some x -> (body x).Invoke &sm
            | None -> ())

let two = fold 1 { for x in Some 1 -> (+) x }

type FoldBuilder<'T> with
    member inline builder.For (result : Result<_, _>, [<InlineIfLambda>] body : _ -> FoldBuilderCode<_>) =
        FoldBuilderCode<'T> (fun sm ->
            match result with
            | Ok x -> (body x).Invoke &sm
            | Error _ -> ())

let three = fold 1 { for x in Ok 2 -> (+) x }

I guess you could also do something like this, although again you'd still need to explicitly define extensions:

[<Extension>]
type FoldBuilderExtensions =
    [<Extension>]
    static member inline For<
        'Input,
        'Extensions,
        'Intermediate,
        'State when ('Input or 'Extensions) : (static member Fold : ('State -> 'Intermediate -> 'State) * 'State * 'Input -> 'State)
        > (
            _builder : FoldBuilder<'State, 'Extensions>,
            foldable : 'Input,
            [<InlineIfLambda>] body : 'Intermediate -> FoldBuilderCode<'State>
        ) =
        let folder sm x =
            let mutable sm = sm
            (body x).Invoke &sm
            sm

        let inline call folder state input = ((^Input or ^Extensions) : (static member Fold : ('State -> 'Intermediate -> 'State) * 'State * 'Input -> 'State) (folder, state, input))

        FoldBuilderCode<'State> (fun sm -> sm <- call folder sm foldable)

[<Extension>]
type FoldExtensions =
    [<Extension>]
    static member Fold (folder, state, input) = List.fold folder state input

    [<Extension>]
    static member Fold (folder, state, input) = Array.fold folder state input

    [<Extension>]
    static member Fold (folder, state, input) = Set.fold folder state input

    [<Extension>]
    static member Fold (folder, state, input) = Option.fold folder state input

    [<Extension>]
    static member Fold (folder, state, input) = ValueOption.fold folder state input

    [<Extension>]
    static member Fold (folder, state, input) = Result.fold folder state input

    [<Extension>]
    static member Fold (folder, state, input) = Seq.fold folder state input

let fold<'T> state = FoldBuilder<'T, FoldExtensions> state

let sum = fold 0 { for x in [1..100] -> (+) x }

let sum' = fold 0 { for x in Some 1 -> (+) x }

let sum'' = fold 0 { for x in set [1..100] -> (+) x }

let sum''' = fold 0 { for x in [|1..100|] -> (+) x }

Thanks for the detailed reply!

The CEs do indeed look nice. I probably wouldn't hesitate to use them myself if they were built-in, but I do have to wonder if it might be a bit magical-looking for beginners?

Maybe instead of unrolling to fold, we can translate it all the way to the imperative version, like this (using IEnumerable as opposed to ICollection too, btw):

let result =
  let mutable s = s
  let enr = ts.GetEnumerator()
  while enr.MoveNext() do
    s <- f s enr.Current
  s

We do lose some rigor, but my gut feeling is that that might not be so bad for a semi-imperative language like F#, as long as it's documented that that's what it's doing under the hood. Unless I'm missing something, it should be fairly rare that fold implementations do anything beyond that, so I think it shouldn't cause too many surprises.