Performance regression for collapse_vars=true with many var defs of complex objects
mlwilkerson opened this issue · 8 comments
Bug report or Feature request?
Bug: 55x performance regression
Version (complete output of terser -V
)
terser 3.7.6
Complete CLI command or minify()
options used
terser --compress collapse_vars=true --output output.js index.js
terser
input
Input is a module with many var defs, where each has a value that is a complex object. Original scenario was a webpack 4 bundle that had imported Font Awesome 5 icons.
This reproduction repo is a more minimal cleaned up demonstration.
Quick link to the input module from that repo.
terser
output or error
This version of terser takes about 30 seconds to produce the bundle, depending on hardware. The output is correct, but this is very slow, and becomes significantly slower in other similar real-world scenarios.
Expected result
This should build in <1 second, depending on hardware. I've seen the same code build correctly as quickly as 0.4 seconds in some previous versions of uglify-js
and uglify-es
that use the same collapse_vars
compression option.
I've timed my scenario against several snapshots of the terser
code base and found that the performance has changed significantly as work on compress.js:collapse()
has been done. For example:
- as of September 20, 2016 ~0.6 seconds
- as of April 18, 2017, ~1 second
- as of May 6, 2017, ~7 seconds
- as of October 8, 2017, ~33 seconds
- as of November 27, 2017, ~57s
@mlwilkerson Thanks for the report and self-contained repro.
This also affects uglify-js 3.4.0
.
Reported upstream: mishoo/UglifyJS#3174
@mlwilkerson Please try out this patch on this and larger real life scenarios and report your findings.
--- a/lib/compress.js
+++ b/lib/compress.js
@@ -1384,7 +1384,12 @@ merge(Compressor.prototype, {
extract_candidates(expr.alternative);
} else if (expr instanceof AST_Definitions
&& (compressor.option("unused") || !(expr instanceof AST_Const))) {
- expr.definitions.forEach(extract_candidates);
+ var len = expr.definitions.length;
+ var i = len - 200; // limit number of trailing variable defs for consideration
+ if (i < 0) i = 0;
+ for (; i < len; i++) {
+ extract_candidates(expr.definitions[i]);
+ }
} else if (expr instanceof AST_DWLoop) {
extract_candidates(expr.condition);
if (!(expr.body instanceof AST_Block)) {
Without patch:
$ /usr/bin/time bin/uglifyjs slow.js -c
function alpha(){"use strict";var obj42={alpha:"42",beta:"42",gamma:[42,42,[],"42","42"]},obj43={alpha:"43",beta:"43",gamma:[43,43,[],"43","43"]},obj44={alpha:"44",beta:"44",gamma:[44,44,[],"44","44"]};console.log(obj42.alpha),console.log(obj43.alpha),console.log(obj44.alpha)}alpha();
40.96 real 41.08 user 0.06 sys
With patch:
$ /usr/bin/time bin/uglifyjs slow.js -c
function alpha(){"use strict";var obj42={alpha:"42",beta:"42",gamma:[42,42,[],"42","42"]},obj43={alpha:"43",beta:"43",gamma:[43,43,[],"43","43"]},obj44={alpha:"44",beta:"44",gamma:[44,44,[],"44","44"]};console.log(obj42.alpha),console.log(obj43.alpha),console.log(obj44.alpha)}alpha();
0.58 real 0.66 user 0.04 sys
Minified output appears to be the same.
Thanks, @kzc. I'm on it.
@kzc I can confirm that this patch solves the problem I've reported. Supporting evidence:
Scenario 1: Webpack 4 bundle
- imports one icon each from three icon packs
- 875 var defs, shaken down to 3
- with the patch
real 0m0.969s
user 0m1.257s
sys 0m0.097s
Scenario 2: Webpack 4 bundle
- imports one icon each from two icon packs
- imports collection of all icon objects from a third icon pack
- 875 var defs, shaken down to 171
- with the patch
real 0m0.974s
user 0m1.244s
sys 0m0.090s
Scenario 3: angular-cli "hello world"
- imports one icon from one icon package
- ~hundreds of var defs shaken down to one
without patch: 145s
with patch: 14s
Scenario 4: real world Angular project
- tens of thousands of lines of code
- importing from three icon packs (on the order of 1,000 var defs)
- tree-shaken down to 51 used icon objects
without patch, deep imports: 55s
without patch, imports from main: 98s
with patch, imports from main: 56s
h/t @devoto13 for running some of these scenarios
@mlwilkerson Thanks for reporting your findings.
Speed and size aside, can I assume that all these minified programs ran correctly?
Yes, that's a safe assumption that they all ran correctly. In my many runs I've not always verified the correctness of the output, but any time that I have spot-checked (maybe 50% of the time), it's always looked correct (properly tree-shaken), and always executed correctly.
To be more certain, I did just re-verify the correctness of the minified programs for Scenario 1 and 2 above. Both ran correctly. I feel highly confident that Scenarios 3 and 4 would also be correct, but I have not verified. @devoto13 can you verify?
@mlwilkerson Yes, both bundles work correctly.
Thanks