caxy/php-htmldiff

Major decrease in speed after upgrading from v0.1.5 to v0.1.7

waltertamboer opened this issue ยท 10 comments

We have an application where we generate a comparison between two files. This file is pretty big and on v0.1.5 this took 21 seconds. After upgrading to v0.1.7 the comparison took 26 minutes.

After looking at v0.1.5...v0.1.7 - I suspect the multibyte functions to be the cause of this but I did not dug any further into this issue. There probably is some inefficient part in the code that causes this insane increase of time but I have not enough domain knowledge to pinpoint what's going on.

I'm more than happy to help out on this. Just let me know what I can do.

Hi Walter,

I am pretty sure you are right that the multibyte support is causing a performance penalty.
This was also noted when accepting the pull request, I think this is a trade of that has to be made.

My advice for now, make sure you are using atleast php7, maybe consider doing the diffing in a CLI or as a microservice.

Thanks for your advise Sven. We are indeed on PHP 7 and the diff is generated in the background via a Beanstalkd job. The problem is that it's unacceptable for our customers to wait this long so I'd love to find a solution for this.

@SavageTiger @waltertamboer Do we think adding in a config option to disable multibyte support is something worth doing?

I think that would be an option here that is something we could do immediately, but not the ideal solution - ideal solution would be to make some of the performance improvements that require larger-scale changes... those would take some time

Some users with timeout issues reverted back the the 0.1.5 with significant performance increases with the following hack:

protected function isOnlyWhitespace($str) { // Slightly faster then using preg_match // return $str !== '' && (mb_strlen(trim($str)) === 0); return $str !== '' && (strlen(trim($str)) === 0); }

Some profiling info that pin-pointed the suspected issue

https://www.drupal.org/project/diff/issues/3030979#comment-12964298

See PR: #81

Fixes from #81 are included in new release v0.1.9

Thanks! I'll try to test how these changes perform in our application asap.

Josh packagist still shows 0.1.8 as the latest version. Not sure why

Hm, the connector must be outdated. Will update!

@SavageTiger New release is now updated on packagist