Reading the same schema from PostgreSQL takes a lot longer than from MSSQL

Question

Reading the same schema from PostgreSQL takes a lot longer than from MSSQL

Arnagos opened this issue 2 years ago · 4 comments

I was running my test suite and noticed that the PostgreSQL tests take A LOT longer than the MSSQL tests.
To verify this I wrote a short benchmark with BenchmarkDotNet and got these results:

MSSQL

Method	Mean	Error	StdDev	Median	Gen0	Gen1	Allocated
ReadAll	765.5 ms	18.91 ms	54.85 ms	750.3 ms	2000.0000	1000.0000	14.47 MB
AllTables	456.9 ms	12.77 ms	37.45 ms	457.6 ms	2000.0000	1000.0000	13.98 MB

PostgreSQL

Method	Mean	Error	StdDev	Gen0	Gen1	Allocated
ReadAll	2.835 s	0.0536 s	0.0638 s	1000.0000	-	7.03 MB
AllTables	2.653 s	0.0485 s	0.0454 s	-	-	4.48 MB

As you can see both ReadAll and AllTables take a lot longer for PostgreSQL (3.7x and 5.8x respectively).
Knowing the performance of PostgreSQL this result is really surprising to me and seems rather strange.

Based on my profiling the culprits seem to be the PrimaryKeys (1.2s) and ForeignKeys (1s) methods of the PostgreSQLAdapter.

Answer 1 · 2022-09-26T13:21:46.000Z

I took a quick look at the SQL used to load the constraints and think a faster solution to the problem would be to load all the constraints simultaneously instead of executing the same (very expensive) SQL multiple times. Every run of the constraints SQL takes 70-100ms, which results in the above numbers if applied to every single loaded table and constraint type. Loading all constraints at once on the other hand only takes ~200ms and results in a few hundred results that can then easily be filtered in C#.

Answer 2 · 2022-09-27T20:13:35.000Z

ReadAll internally calls AllTables, which internally calls TableBuilder.

TableBuilder does seeral calls on constraints- for the different types (primary key, foreign key, unique, check, null). For all, it does all tables - so it shouldn't be doing anything table by table.

For PostgreSql, 3 are the constraints use the same query (primary, foreign, unique), so yes, that is repeated, but maximum 3 times irregardless of number of tables. It may be possible to optimize this in the postgreSqlAdaptor class down to one, which is cached and reused between the (consecutive) calls- although you wouldn't want that cache to remain in memory.

Answer 3 · 2022-10-05T07:47:07.000Z

Do you have an idea, why PostgreSQL takes more than thrice the time of MSSQL? Ordinarily, one'd expect the opposite ...

Answer 4 · 2022-10-05T11:30:59.000Z

I guess the metadata tables are not optimized - for example, no indexes on schema or table name, so every query is a full table scan. Also sometimes the metadata tables contain vast amounts of system data- Oracle is a bad offender here, it is by far the slowest to read.