6bee/aqua-core

TypeResolver.ResolveType getting slow with nested anonymous types

azabluda opened this issue · 5 comments

As I was playing with EntityFrameworkCore 2.0.0-preview1-final I ran into rather severe performance issues. Some of my tests would suddenly take a few minutes and even hours to execute. A bit deeper analysis hints at their new IncludeCompiler which tends to expand included navigation properties into subsequent SelectMany/GroupJoin more aggressively, generating a fairly large number of nested anonymous types on the way. One could certainly find their approach questionable, but I still think there exists a performance problem in aqua-core which deserves its own attention.

Apparently TypeResolver.ResolveType doesn't scale too well with the length of the chain of nested anonymous types. I wrote a little program which generates such types and measures the performance of TypeResolver

[Fact]
public void ResolveTypePerformanceTest()
{
    TypeInfo GenerateAnonymousType<T>(uint nestingCount, T value)
    {
        if (nestingCount == 0)
            return null;

        var newValue = new { Prop = value };
        return GenerateAnonymousType(nestingCount - 1, newValue) ?? new TypeInfo(newValue.GetType());
    }

    for (uint i = 15; i <= 30; ++i)
    {
        TypeInfo type = GenerateAnonymousType(i, "hello");
        var typeResolver = new TypeResolver();

        var watch = Stopwatch.StartNew();
        typeResolver.ResolveType(type);
        watch.Stop();

        Debug.WriteLine($"{i} | {watch.ElapsedMilliseconds}");
    }
}

As one can see the cost of resolving only one type is growing quite rapidly.

nestingCount execution time (ms)
15 55
16 56
17 88
18 117
19 199
20 319
21 525
22 792
23 1309
24 1963
25 3109
26 4943
27 8065
28 12996
29 21031
30 33822

The debugger suggests that it's mainly spending time computing collection hashes for nested anonymous types (TypeResolver.EqualityComparer.GetHashCode). In real world scenarios you usually have more than one type cached in a TypeResolver and you resolve types hundreds of times (e.g. when visiting Remote.Linq expression trees). Then the performance starts to suffer with much shorter chains of nested types, e.g. my EFC2 test never returned with type nesting depth being only 16.

Could you please take another look at TypeResolver.EqualityComparer from the performance point of view?

6bee commented

Great test!
Obviously GetHashCode has to be optimized for performance rather than accuracy.
Pushed version aqua-core 4.0.0-beta-003

I knew you'd like it :)

Unfortunately the performance problem has simply shifted from TypeResolver.EqualityComparer.GetHashCode to TypeResolver.EqualityComparer.Equals. I modified the test slightly for you to reproduce it more easily:

for (uint i = 15; i <= 20; ++i)
{
    var type1 = GenerateAnonymousType(i, "hello");
    var type2 = GenerateAnonymousType(i - 1, "hello");
    var typeResolver = new TypeResolver();

    var watch = Stopwatch.StartNew();
    typeResolver.ResolveType(type1);
    typeResolver.ResolveType(type2);
    watch.Stop();

    Debug.WriteLine($"{i} | {watch.ElapsedMilliseconds}");
}

still scales exponentially

15 | 1219
16 | 2747
17 | 5854
18 | 12747
19 | 28249
20 | 71767

I'm not sure but it seems we need to treat generic types (not necessarily anonymous ones?) in a way that only generic definitions are being cached. In the a.m. example this would lead to only two entries in the cache System.String and <>f__AnonymousType1, which should speed up type resolution.

6bee commented

Thanks for your suggestion. I'm going to consider this.
You need to be patient though, as I'm a bit short of time... Unless you're keen on sending a pull request.

Ok, I will look into this and let you know if I make any progress.

6bee commented

@azabluda Thank you so much for your valuable contributions 👍