fscheck/FsCheck

Version 3.x, register Arbitraries directly with ArbMap

Closed this issue · 12 comments

Feature

I love the move to ArbMap in v3, over the thread-level registry.

It'd be great if we could directly register Arbitrary instances with ArbMap instead of needing an intermediary type with static members.

For example

let customArb = //...
ArbMap.defaults |> ArbMap.with customArb

This would allow users to simply define and use an Arbitrary all in one local scope.
This seems especially relevant now that ArbMap enables users to override type generation in a limited scope. It seems likely users would define an arbitrary only meant to be used in constructing some parent arbitrary.

Implementation Ideas

It seems TypeClass currently bears the responsibility for tracking the map of types to Arbitrary instances, and it tracks them as Map<InstanceKind,MethodInfo>.

In order to directly register Arbitraries, this map would need to change.

I haven't quite grasped TypeClass yet, but here's my idea.

  • It makes sense to me that ArbMap is responsible for tracking instances of arbitraries, maybe Map<InstanceKind,IArbitrary>.
  • TypeClass can be responsible for knowing what a match looks like / the light construction it seems to do
    • maybe arbMap |> TypeClass.getInstance type
  • Discovery then also probably moves somewhere else, and it outputs an ArbMap
    • maybe let arbMap = Discover.factoriesOnType<type> (or maybe it outputs a Map<Type, IArbitrary>)
    • If we don't want to invoke the factories right away, then we could create some internal lazy Arbitrary derivative that invokes the matched static factory on first call

I've investigated further and was able to create a pretty simple proof of concept.

Branch comparison

The bulk of it is this one method added to TypeClass

member this.MergeMethod(factory: System.Func<'TypeClass>) = 
    let typeClass = this.Class
    let toRegistryPair (m:MethodInfo) = // mostly a rip of addMethod from findInstances
        match m.ReturnType with
        | GenericTypeDef typeClass args when args.Length <> 1 -> 
            failwithf "Typeclasses must have exactly one generic parameter. Typeclass %A has %i" typeClass args.Length
        | GenericTypeDef typeClass args ->
            let instance = args.[0]
            (InstanceKind.FromType instance,m)
        | _ -> failwith "Lambda did not return a compatible type for this type class"
            
   let updatedMap = this.InstancesMap.Add(toRegistryPair factory.Method)
   new TypeClass<'TypeClass>(updatedMap, injectParameters, injectedConfigs)

This doesn't handle factories with parameters, but I thought I'd get feedback before tackling that.

This lambda-based approach causes less disruption or duplication than trying to keep a direct list of arbitrary instances. It also should allow the inline registrations to mimic and use all the same techniques as current factories registered on types.

MergeMethod could accept a MethodInfo instead of a Func to be more general. For example, it would let people directly register individual methods off existing types where they're keeping arb factories. I'd think some level (maybe ArbMap) should include wrappers that accept lambdas and covert to MethodInfo. That feels more intuitive for registering arbs constructed in the local context.

Thoughts?

That looks like a good start!

Since only F# support anonymous interface implementations, the "all in one local scope" only works for F#, or in C# when it's possible to create Arbitrary instances from existing ones (e.g. using Arb.filter), right?

I don't think so. The above implementation intentionally uses System.Func and not FSharp.Core.FSharpFunc. Func is the type involved when you see (foo) => bar in C#.

Func doesn't support partial application, but it should work with anonymous functions in either C# or F#. F# does some magic to make the conversion from F# functions to Func.

For example, this is the little test I added for MergeMethod translated to C#

public class TestImpl : ITypeClassUnderTest<int>
{
    public int GetSomething() => 1;
}

[Fact]
public void MergeFactoryShouldInstantiatePrimitiveType()
{
    var instance =
            TypeClass<ITypeClassUnderTest<_>>
                .New()
                .MergeMethod(() => new TestImpl())
                .InstanceFor<int, ITypeClassUnderTest<int>>()

    Assert.Equal(1, instance.GetSomething)
}

This could translate into local context registrations like

using ArbMap.Fluent;
//...
var subArb = //...
var arbWithCustom = ArbMap.Default.MergeMethod(() => subArb)
// now use the custom arb in a property or such

Yes, understood - I was just (perhaps pedantically :) ) thinking that in the C# case TestImpl is in the global scope, i.e. people can't keep the implementation local.

Ah. I see how I misread your earlier comment.

I think you're correct, but I also think the category of arbitraries made from other generators and arbitraries is a pretty significant category. The library offers a pretty robust set of building blocks.

Expanding to config parameters was easy enough, but I ran into some unfortunate issues while working on typeclass parameters.

  • Func cannot have generic type parameters. I.e. you can't have a Func<Typeclass<'T>, Typeclass<'T list>>, but you could register Func<Typeclass<int>, Typeclass<int list>>.
    • This means direct factory registration can't mimic a non-trivial class of existing arb factories
  • F# function arguments can't be converted to System.Func, and F# does not appear to infer Funcs from anonymous functions when they are passed to an F# function and not a type's method
    • i.e. let from0 (f: unit -> 'a) = (Func<'a>(f)).Method will throw an exception on invocation
    • let from0 (f: Func<'a>) = f.Method ... from0 (fun () -> 5) won't compile

It's not entirely clear to me what we should do about these issues. The second one can be worked around, it's just makes the api a bit awkward in F#. The first one seems like a fundamental limitation, and one not intuitive for end users.

Func cannot have generic type parameters. I.e. you can't have a Func<Typeclass<'T>, Typeclass<'T list>>, but you could register Func<Typeclass, Typeclass>.

Yes, because there's no "free" type argument to use to register those anyway. That was the main reason for having this whole registration process via methods (you can add generic type arguments to a method definition). I figured there isn't really much that can go wrong here from the user's perspective, i.e. the compiler will tell them that this sort of thing doesn't work.

It would be possible to define some real types that are interpreted by the registration mechanism as generic types, e.g.

type TGP1 = class end  // T Generic Placeholder
type TGP2 = class end
type TGP3 = class end

but I think that's too obfuscated and unusual.

F# function arguments can't be converted to System.Func,

I am perhaps missing something here - would it be possible to define an overload with FastFunc which I believe are F#'s function type.

I hadn't thought of the placeholder types. That's a clever solution, but I agree it's probably not very intuitive.
And yes, the compiler should generally warn against attempts at generic lambdas.

Hmm, I think F#'s function type is now FSharpFunc. FastFunc produces search results but doesn't seem to have a reference page. Maybe it got renamed?
The problem is that there doesn't seem to be a general way to get MethodInfo from an FSharpFunc (source, source). MethodInfo can be obtained for a System.Func though. Func has constructors that accept F# functions, but they lead to the System.Reflection.TargetException unless it's directly passed a lambda.

Here's some of the experiments I ran

type MethodInfoFromMethodParameter =
    static member from0 (f:Func<'a>) = f.Method
    static member from1 (f:Func<'a,'b>) = f.Method

module MethodInfoFromFSharpFunc = 
    let from0 (f: unit -> 'a) = (Func<'a>(f)).Method
    let from1 (f: 'a -> 'b) = (Func<'a,'b>(f)).Method

[<Tests>]
let funcConversionTests = 
    testList "" [
        test "Direct func" {
            let f = Func<int>(fun () -> 5)
            f.Invoke()
            |> ignore
        }
        test "From method argument" {
            let mi = MethodInfoFromMethodParameter.from0 (fun () -> 5)
            mi.Invoke(null, Array.empty<obj>)
            |> ignore
        }
        test "From F# func (fails)" {
            let mi = MethodInfoFromFSharpFunc.from0 (fun () -> 5)
            mi.Invoke(null, Array.empty<obj>)
            |> ignore
        }
    ]

I got more feedback clarifying the TargetException and how dynamic invocation works.

The FSharpFunc to Func conversion works just fine, if you also keep track of the func.Target. I was running into cases where I expected I was passing a static method, but F# wasn't figuring that out and still expecting a closure/target.

A straightforward solution to this would be swapping the Map<InstanceKind,MethodInfo> for a Map<InstanceKind, (obj, MethodInfo)> to allow for target information.

I notice that Discover has an override Discover<'T>(onlyPublic,instance:'T) that (I think) registers non-static methods from an instance. How is that currently working when GetInstance always passes null for the invocation target?

Hmm, I think F#'s function type is now FSharpFunc. FastFunc produces search results but doesn't seem to have a reference page. Maybe it got renamed?

Yes sorry, I think that was the old old name, apparently still in my muscle memory!

How is that currently working when GetInstance always passes null for the invocation target?

I don't think it works, doesn't look used either. It's likely a leftover from some earlier experiment.

I think changing the Map's values to tuple is fine though? Would need a few other changes to pass in null.

I think the work is ready for review. I've experimented with both the Fluent and FSharp APIs, and it seems to work as I expect.
I added tests for any errors I found.

The public endpoint names may need some tweaking

//Fsharp.ArbMap
let mergeFactory (factory: unit -> Arbitrary<'b>) (existingMap: IArbMap) 
let mergeArbFactory (factory: Arbitrary<'a> -> Arbitrary<'b>) (existingMap: IArbMap)
let mergeMapFactory (factory: IArbMap -> Arbitrary<'b>) (existingMap: IArbMap) 
let mergeArb (arb: Arbitrary<'a>) (existingMap: IArbMap)  
//Fluent.ArbMap
static member MergeArb<'T>(map: IArbMap, arb:Arbitrary<'T>);
static member MergeArbFactory<'T>(map: IArbMap, factory:Func<IArbMap,Arbitrary<'T>>) ;
static member MergeArbFactory<'T, 'U>(map: IArbMap, factory:Func<Arbitrary<'T>,Arbitrary<'U>>);
static member MergeArbFactory<'T>(map: IArbMap, factory:Func<Arbitrary<'T>>);

Completed by #613