phpstan/phpstan-doctrine

Performance issues after 1.3.54

derrabus opened this issue ยท 19 comments

Upgrading this library from 1.3.53 to 1.3.54 and beyond caused serious performance issues for us.

When running PHPStan with 1.3.x-dev of the plugin on my laptop, PHPStan's progress bar is stuck at 98% for 5 minutes, but the process completes eventually. With 1.3.53, PHPStan completes in 20 seconds on an empty result cache. On our CI, the PHPStan job timed out regularly which hasn't been a problem before.

With git bisect, I could identify a9d0aaf (#500, @janedbal) as the casue of this issue. If I revert this very commit on the 1.3.x branch, everything's back to normal. Our workaround for now is to pin phpstan-doctrine to the 1.3.53 release.

Unfortunately, I cannot publish the codebase I'm experiencing this issue on. I'm also a bit lost about how I could isolate a reproducer here, but I'm willing to answer any questions that might help someone investigating the problem.

Hi, this sounds like it gets stuck on a single file where the complexity explodes. If you run PHPStan with -vvv --debug you'll find that file. If you then simplify the file and other files this one calls (files with method declarations this file calls), you'll find the code pattern the extension gets stuck on.

Hey, I just reverted a9d0aaf and going to release that. I'd still appreciate if you were to point down the performance hog in 1.3.54.

I also experienced slowdown in real-world project but not as serious.

Idea: I can imagine similar behaviour, if the codebase would contain some really huge chain of methodcalls which return QueryBuilder. The differece of behaviour caused by this MR is best described in #499.

The way it works is that it deeply jumps to every method that returns QueryBuilder. And if that one contains method call that returns QueryBuilder, it jumps even there. So some huge chain of those might slow down the analysis significantly.

Instead of reverting, I think it should be possible to limit the depth of this chain @ondrejmirtes

With revert, nobody ever finds out where the problem resides and this significant improvement will never be present in this plugin imo

I'm usually for causing a bit of pain to force people to report problems so that the software can get better for everyone later.

But in this case I'm seeing 50 % slowdown with no change in analysis precision on a real-world 1M+ LoC codebase. So I decided to revert it.

People can still install 1.3.56 and debug where the slowdown really is. That's why I didn't close this issue.

I still think this brings significant value and (for example) adding an option to enable it (disable by default instead of revert) would be valueable for a lot of codebases.

Personally, I can copy the extension to our codebase to keep using it, but nobody else can benefit from that...

Just measured on our ~2M LoC codebase (600 doctrine entities). Compared to 1.3.56:

  • 1.3.57: slightly longer runtime, 52 new errors Could not analyse QueryBuilder (using reportDynamicQueryBuilders: true)

So I cannot debug the issue you are facing, but I can confirm it significantly helps in some codebases.

So I'd really appreciate any example test / repo where I could detect the core issue @derrabus

I tried to implement some caching in #518 (which might help a lot in codebases where you some chained createBaseQueryBuilder is called ton of times).

If somebody could confirm that it helps/does not help, it would be great. Thx

Please test 1.3.x-dev again, there's an improved version.

I've tested 1.3.x-dev (03b7d3f) on our codebase and there's no measurable difference to 1.3.53. Thank you very much for resolving this issue so quickly despite my poorly written bug report. ๐Ÿš€

Awesome, I'm gonna release this and we'll see if someone else finds degraded performance :) Thanks everyone.

mitelg commented

Hey there ๐Ÿ‘‹

I had the same problem after updating to 1.3.54 in https://github.com/shopware5/shopware
1.3.57 was okay again, but same problem with 1.3.58

After letting it run for some time this is the error output:

-- ------------------------------------------------------------------------------------------------------------------------- 
     Error                                                                                                                    
 -- ------------------------------------------------------------------------------------------------------------------------- 
     Reached internal errors count limit of 50, exiting...                                                                    
     Internal error: Child process timed out after 600.0 seconds. Try making it longer with parallel.processTimeout setting.  
     Child process error (exit code 139): Segmentation fault (core dumped)                                                    
                                                                                                                              
     Child process error (exit code 139): Segmentation fault (core dumped)                                                    
                                                                                                                              
     Child process error (exit code 139): Segmentation fault (core dumped)                                                    
                                                                                                                              
     Child process error (exit code 139): Segmentation fault (core dumped)                                                    
                                                                                                                              
     Child process error (exit code 139): Segmentation fault (core dumped)                                                    
                                                                                                                              
 -- ------------------------------------------------------------------------------------------------------------------------- 

So I am sorry to say, it is not fixed yet for us ๐Ÿ™ˆ

/cc @janedbal This is definitely the recursion problem I talked about.

@ondrejmirtes @mitelg Yeah, segfault is callstack overflow. Let me add some recursion prevention.

@ondrejmirtes Here it is #521

Gonna release the recursion fix shortly as 1.3.59. Please test it.

mitelg commented

works! ๐ŸŽ‰

thanks โค๏ธ no I have same issues to fix ๐Ÿ˜„

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.