OSGeo/PROJ

REGR: 7.2.1+ ~20x slower than past versions.

Closed · 11 comments

From: pyproj4/pyproj#827

Example of problem

Pseudocode:

    PJ_CONTEXT* context = proj_context_create();
    PJ* pj = proj_create(
        context,
        "+proj=omerc +lat_0=45.0 +lonc=45.0 +k=1 +alpha=-45.0 +gamma=0.0 +units=m +ellps=WGS84 +no_defs +type=crs"
    );
    proj_destroy(pj);
    proj_context_destroy(context);

Problem description

PROJ <= 7.2.0 appears to be ~20x faster than PROJ 7.2.1+. Based on a git bisect, the performance change appears to have been introduced in #2474.

Environment Information

  • It appears to impact both macOS and Linux

Installation method

  • conda & from source

There is likely a missing element in the scenario you mention. If I run the following snippet, proj.db is not opened at all and it runs in 1.2 seconds (so ~ 0.1 ms per call):

    #include "proj.h"

    int main()
    {
        for(int i = 0; i < 10000; ++i )
        {
            PJ_CONTEXT* context = proj_context_create();
            PJ* pj = proj_create(
                context,
                "+proj=omerc +lat_0=45.0 +lonc=45.0 +k=1 +alpha=-45.0 +gamma=0.0 +units=m +ellps=WGS84 +no_defs +type=crs"
            );
            proj_destroy(pj);
            proj_context_destroy(context);
        }
        return 0;
    }

I tried with "EPSG:4326" as the string and it is much slower, of course (~ 45 s for 10,000 iterations, so ~ 4.5 ms per call), but there is no significant difference whether or not checkDatabaseLayout() runs.
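
The per-call figures quoted in this comment are total wall-clock time divided by the iteration count. A minimal sketch of a harness that measures this for an arbitrary workload is given here, assuming a C11 toolchain (for `timespec_get`). The names `time_per_call_ms`, `per_call_ms` and `dummy_workload` are hypothetical and not part of PROJ; the body of the loop in the snippet above would take the place of the dummy workload.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: runs fn() `iterations` times and returns the
 * average wall-clock cost of one call, in milliseconds. */
double time_per_call_ms(void (*fn)(void), int iterations)
{
    assert(iterations > 0);
    struct timespec start, end;
    timespec_get(&start, TIME_UTC);
    for (int i = 0; i < iterations; ++i)
        fn();
    timespec_get(&end, TIME_UTC);
    double elapsed_ms = (double)(end.tv_sec - start.tv_sec) * 1e3
                      + (double)(end.tv_nsec - start.tv_nsec) / 1e6;
    return elapsed_ms / iterations;
}

/* Hypothetical helper: converts a total run time in seconds into
 * milliseconds per call, e.g. 1.2 s over 10000 calls is about 0.12 ms. */
double per_call_ms(double total_seconds, int iterations)
{
    assert(iterations > 0);
    return total_seconds * 1e3 / iterations;
}

/* Stand-in workload (an assumption for illustration only); replace the
 * body with the proj_context_create()/proj_create() sequence to time it. */
static volatile unsigned long sink;
void dummy_workload(void)
{
    sink += 1;
}
```

Calling `time_per_call_ms(dummy_workload, 10000)` returns the average cost of one call. Timing the same workload against each PROJ build gives numbers that can be compared directly, without relying on a bisect.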

> If I run the following snippet, proj.db is not opened at all and it runs in 1.2 seconds (so ~ 0.1 ms per call)

I am assuming this is with PROJ 7.2.1+. How fast is it with PROJ 7.2.0?

> I am assuming this is with PROJ 7.2.1+. How fast is it with PROJ 7.2.0?

I tested with 7.2.0, the head of the 7.2 branch (~ 7.2.1), and master. proj.db isn't opened at all, so #2474 is not relevant.

Sounds like I may have made a mistake with the bisect. Feel free to re-do it and see what you find.

> Feel free to re-do it and see what you find.

I will not find anything, since I get the same performance for all versions :-) I guess your use in pyproj must do something in addition to your pseudocode.

Ah, that makes sense. These would be the lines pyproj adds that you could add for testing:

    proj_context_use_proj4_init_rules(context, 1);
    proj_context_set_autoclose_database(context, 1);

It also calls proj_context_set_search_paths to set the path. I wonder if this is where the DB connection occurs?

> proj_context_set_autoclose_database(context, 1)

That's the key to reproducing it: this call does indeed cause a database connection to be opened later, in the calls made by proj_create().

Thanks @rouault 👍

Is this regression fixed in PyProj>=3.2.1? I haven't been able to use PyProj>=3.0.1 for months because of this serious performance problem.

I can't exactly tell from the history on this thread.

> Is this regression fixed in PyProj>=3.2.1?

Should be. Have you tried it out yet to verify?