PawelGerr/Thinktecture.EntityFrameworkCore

Greatly increased memory usage after migration to .NET 7

Stocki916 opened this issue · 6 comments

Dear community,

I'm developing an import tool that selects data from a Microsoft SQL Server and inserts the results into a NoSQL database (RavenDB). To select the desired data on the SQL Server side, I use Thinktecture temp tables, into which I insert the IDs of the data to import.
Until two weeks ago, the import tool targeted .NET 6 and used EF Core 6.0.12 and Thinktecture 4.5.1.
During the import, memory usage was around 2 GB (it depends on the amount of data, but between 1.5 and 2 GB was normal).

After migrating to .NET 7 (7.0.4) and Thinktecture 7.0.0 (and also 7.1.0), the import tool uses more than 9 GB (sometimes up to 14 GB).
I removed all other NuGet packages and reduced the code so that it only uses EF Core and Thinktecture, so the high memory usage must be caused by one of these libraries.
Is this a known issue, or are there any thoughts about it?
I could also stay on .NET 6 for the moment, but would like to update in the future to use some of the new features :-)

Hi, can you provide a minimal repro? I'd like to analyse this case with a memory profiler.

Hi, sure, I will try to create a minimal repro for testing. Should it come without data, or with some demo data in it?

Maybe you can generate some random data before the actual test?

Hi,
It's not that easy to build a suitable test demo at the moment, but I did some further investigation and now have a better understanding, which might also help.

We fill a temp table with IDs and use this temp table for joining.

```csharp
var tempTable = await _context.BulkInsertValuesIntoTempTableAsync(ids, cancellationToken: cancellationToken);

return await _context.Items
    .Join(tempTable.Query,
        p => p.Id,
        sIds => sIds,
        (p, sIds) => p)
    .Select(s => s.Id)
    .Distinct()
    .ToListAsync();
```

This demo runs without high memory usage, but when I pass a cancellation token to the `ToListAsync()` method, memory usage rises extremely high!

```csharp
var tempTable = await _context.BulkInsertValuesIntoTempTableAsync(ids, cancellationToken: cancellationToken);

return await _context.Items
    .Join(tempTable.Query,
        p => p.Id,
        sIds => sIds,
        (p, sIds) => p)
    .Select(s => s.Id)
    .Distinct()
    // The only difference from the snippet above: the token is passed here.
    .ToListAsync(cancellationToken);
```

In our import project, the importer takes 2 GB of memory without a cancellation token and 18 GB with one. In the memory debugger in Visual Studio, I can see what seem to be a lot of parallel threads running against the database, each with a cancellation token and a "callback" node.
I guess the thread handling changed between .NET 6 and .NET 7, and now many more threads are used than before.
[Image: RavenDataDistribution_WithCancellationToken]

But I'm not sure whether this can really be controlled by Thinktecture or whether it's an EF Core design decision.

OK, this was pretty easy to test: I removed the temp table and just made an async request to the database that selects a large set of data. Even without any temp table, memory usage increases dramatically :-)
So it's not a Thinktecture issue, and I need to concentrate on EF Core.

The cause is a memory leak in Microsoft.Data.SqlClient v5.0.1 and v5.1.0.
Updating to v5.0.2 or v5.1.1 fixes the issue.

dotnet/efcore#30691
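
For anyone landing here, a minimal sketch of applying that fix. It assumes the project gets Microsoft.Data.SqlClient only transitively through the EF Core SQL Server provider and uses SDK-style `PackageReference` items. Adding a direct reference in the project file should then lift the resolved version to a patched one (5.1.1 here; 5.0.2 if you prefer to stay on the 5.0.x line):

```xml
<ItemGroup>
  <!-- Assumption: SqlClient was previously only a transitive dependency.
       A direct reference takes precedence over the version requested by the EF Core provider. -->
  <PackageReference Include="Microsoft.Data.SqlClient" Version="5.1.1" />
</ItemGroup>
```

After restoring, `dotnet list package --include-transitive` can be used to check which Microsoft.Data.SqlClient version was actually resolved.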