microsoft/TypeScript

Pure annotation in downlevel emits

jods4 opened this issue ยท 29 comments

jods4 commented

Minifiers like Uglify try to remove dead code from minified builds.
To completely remove a declaration, they have to prove that it is side-effect free.
Unfortunately in JS this is often not possible as pretty much anything can have arbitrary effects.

An important issue is that ES5 emit for class can't easily be determined to be side-effect free.
As a result, tree-shaking is ineffective. In simple words: no class is removed from the output, even if it is never used.

A very simple solution could be to have a hint for the minifier, inside a special comment.
If the emit added a /** #pure */ comment at the beginning of ES5 classes it could easily be detected and removed by Uglify.

var V6Engine = /** #pure */(function () {
    function V6Engine() {
    }
    V6Engine.prototype.toString = function () {
        return 'V6';
    };
    return V6Engine;
}());

This is an important enhancement that can dramatically reduce the size of unused code, especially in large libraries like Angular or Aurelia.

Is this something the TS team would consider?
Original discussion in Uglify: mishoo/UglifyJS#1261

I am aware of #3882 and #7770 and this is not the same.
Those issues are about extending the language to include some "pure" annotations.
This issue is just about adding a hint (comment) inside the emit template.

If uglify adds this support we can consider emitting the comment. I am assuming this does not apply to classes with static intializers or decorated ones.
I am also assuming this applies to enums, and namespaces with only "side-effect-free" class declarations

jods4 commented

Thanks. We'll see what happens but whether TS would be willing to add the comment is a big factor.

The static initializers and decorators were brought up in the Uglify comments as well.
Currently TS emits those outside the class-defining function, so you wouldn't need to take them into account. This class:

@mammal
class Animal {
    static frob = 24;
}

turns into this:

// The class definition itself is pure
var Animal = /** #pure */(function () {
    function Animal() {
    }
    return Animal;
}());

// Uglify will need to decide what to do with the following (if it can drop them or not)
Animal.frob = 24;
Animal = __decorate([
    mammal
], Animal);

Static initializers and decorators are less common, so that's already a big step forward.

Enums and namespaces are also less common and smaller than classes so they were not discussed AFAIK. I guess anything that could help drop unused code is welcome.

@mhegazy Uglify now merged support for pure statements.

jods4 commented

For reference, the implementation details are here:
mishoo/UglifyJS@1e51586

kzc commented

For what it's worth, the pure function call comment annotation feature has been released in uglify-js@2.8.1:

$ echo 'foo(); var a=/*#__PURE__*/(function(){console.log("Hello");}()); bar();' | bin/uglifyjs -c toplevel
WARN: Dropping __PURE__ call [-:1,26]
WARN: Dropping unused variable a [-:1,11]
foo();bar();
jods4 commented

I was musing some more and while you're considering this issue I'd like to try to push it further.

Adding /*#__PURE__*/ in downlevel emit neatly solves the problem of tree-shaking classes in ES3/5.
But decorators are still an issue and unfortunately very common in frameworks like Angular and Aurelia since they provide a very convenient way to attach metadata.

Because it's impossible to prove that decorators are side-effect free (and in the strict sense they most often are not), any class that has a decorator cannot be tree-shaken, no matter if ES5 or ES6.

I think it is possible, with two changes:

  1. By injecting an additional /*#__PURE__*/ before __decorate. The trick is that it can't be always added like in the class codegen, because some decorators might have interesting side-effects.
    My idea is: if there's a #__PURE__ comment before the decorators in TS, add a #__PURE__ comment before the __decorate call.
// Turn this Typescript code:

/*#__PURE__*/
@cacheable
class Frob { }

// Into this ES6 emit:
var Frob = class { };
Frob = /*#__PURE__*/__decorate([cacheable], Frob);
  1. This is unfortunately not enough because Uglify does not have data flow analysis and the multiple assignments/usage of Frob prevents removing. But the emit could easily be changed to the following form with a single assignment:
var Frob = __decorate([cacheable], class {});

And this can be tree-shaken if decorated with a pure comment.

This comment from Uglify team explains the various patterns that work (or not) in more details.


Bonus round:
What would be awesome but more far-fetched is having the /*#__PURE__*/ comment on the decorators themselves and then adding it in front of __decorate if they all have it. That's a lot better from a user perspective but it's also not a local change anymore in TS.

I explored this for the use in Angular and can confirm that all downleveled classes previously retained by Uglify are being correctly removed if the IIFEs are prefixed with the annotation.

I'd love to see TS emit the /*@__PURE__*/ annotation by default for all downleveled ES classes.

@DanielRosenwasser please follow up with relevant teams to see what we can do

This could solve a considerable amount of size bloat problems for angular users attempting to minify their transpiled TS.

We have some reservations on the use of PURE, since it already has a different meaning in this context, see pure function definition in Wikipedia.

Would /*@__nosideeffects__*/ be an option here?

kzc commented

Would /*@__nosideeffects__*/ be an option here?

Uglify already supports /*@__PURE__*/ in line with its command line option pure_funcs. They are in use in the wild. It would be nice to support a single convention.

Barring that, could you have the typescript compiler allow the user to override the annotation string?

Uglify already supports /@PURE/ in line with its command line option pure_funcs.

I understand.

It would be nice to support a single convention.

I am saying this convention is wrong; pure functions already have a well defined meanings in the realm of programming languages, compilers and optimizers. using the same term to mean something different is misleading at best.
I am suggesting that uglify supports another annotation as well, let's call it /*@__nosideeffects__*/; since the implementation is already in place this should be rather trivial. they can then deprecate the /*@__PURE__*/ annotation, or not, up to the uglify team; but the TS compiler would only emit the /*@__nosideeffects__*/ which is really the check that is proposed here, making sure that the class has no observable side-effects outside defining a symbol.

kzc commented

the TS compiler would only emit the /*@__nosideeffects__*/

In the worst case scenario users can perform a string replace on the generated code to change the annotation string for uglify. If typescript releases support for /*@__nosideeffects__*/ uglify will consider supporting it.

I'm not a Typescript or Angular user, but they'd probably appreciate typescript rolling in #15857 at the same time as the class IIFE annotation.

I understand @mhegazy naming concern. I agree with his suggestion. I think at this point the situation is so bad that we'll take whatever string we can get from tsc and lobby at Uglify (or replace it if necessary).

#15857 is however currently more important because we don't have a workaround for it.

continuing the discussion with @DanielRosenwasser from #15857 (comment)

There is a question if it is actually safe in all cases to emit this annotation. I don't think it is, because the static initializers can have non-local side-effects. This is however very rare and I personally think it's a terrible anti-pattern.

I don't know how easy it is for you figure out non-local side-effects within static initializers. It seems doable, but getting it right has prerequisite of you understanding which functions in a program have no side effects (I believe there is an issue for that filed already).

I think it would be sufficient to initially have the #13721 emit be an opt-in flag that we'd recommend our community to use (and turn on by default in angular-cli). Once we understand the cases when it's not safe to emit the annotation, we could then consider making it the default in typescript. But again. Initially, having it be an opt-in is sufficient.

I don't know how easy it is for you figure out non-local side-effects within static initializers. It seems doable, but getting it right has prerequisite of you understanding which functions in a program have no side effects (I believe there is an issue for that filed already).

this is not a trivial probelm. and i would say it is not possible in the genral case, consider functions commming from a .d.ts file.
I would say if we support this, then we would error on the safe side. i,e. we would only emit this annotation iff:

  • class is not decorated
  • class has no static intializers
  • or static intializers all have constant expressions

@mhegazy I think that would be a good start, except for decorators. In Angular we strip decorators from production bundles completely, because all of our decorators are design-time-only.

@chuckjaz has a folding evaluator that allows us to extend the reach of static expressions, which is something you might want to consider, but I don't think it's essential for the first iteration.

@mhegazy I think that would be a good start, except for decorators. In Angular we strip decorators from production bundles completely, because all of our decorators are design-time-only.

In those cases ngc would also know which classes could be elided because it already knows which decorators to special-case.

My understanding is that the problem with only using ngc is that classes from 3rd party libraries (e.g. RxJS) can't easily be removed. But you could go for a joint approach:

  • ngc elides side-effect free decorated classes
  • TypeScript adds the /** @nosideeffect */ marker to undecorated definitely-side-effect-free classes.

Another way to look at this is that the user put the expression as a static initializer in a class for a reason. The intent is clearly that it is related to the class. It makes sense to interpret the intent as, only produce the side-effect of this expression iff the class is used. It is not that the code is side-effect free, it is that the side-effects, if any, are related to the class. The optimizer can assume that if no live references to the class are found outside the iffe then the marked iffe can be elided. It would also indicate that the side-effects can be deferred and moved to a lazy-loaded script if the references are lazy-loadable.

In a sense, this treats classes as sub-modules and the side-effect of loading the "module" can be deferred or elided depending on whether the "module" is loaded (i.e. referenced).

This would mean we need an annotation that states that, not that there are no side-effects, but, rather, the result of the iffe is encapsulated and the side-effects only matter if the result of the iffe is referenced. That is, the side-effects are encapsulated in the function result.

Fortunately, this is how Uglify treats /*#__PURE__*/ today. It doesn't imply it is actually pure but, rather, the expression is only considered referenced if the value is referenced.

Decorators can also be marked as encapsulated and then the tree-shaker would be able to remove or move them as well.

kzc commented

@chuckjaz I agree with you about treating the class as a sub-module irrespective of the side effects of static initializers and decorators.

If Typescript could be changed to preserve comments before function calls then there's a workaround for the static initializer and decorator side effect problem to achieve the desired effect: #15857 (comment)

@chuckjaz any recommendations for the annoation name?

Not in love with any I have come up with but /** @encapsulatedeffect */ or simply /** @encapsulated */ seem the best assuming the implied definition of encapsulated from my previous comment.

Maybe /** @delimitedeffect */ or /** @delimited */? Although that implies that there is some outer scope that delimits the effects, which is not the same as effects-if-referenced.

We discussed this int he last design meeting, and the conclusion was to use something declarative like /** @class **/. the new comment will be emitted for all classes, regardless of them being decorated or not.

--removeComents would cause the comment to not be emitted.

This small change has a fairly significant impact on our tests. While I'm not opposed to the change, do we want to always emit the /** @class */ annotation or should it be behind a flag?

Fixed, pending PR.

@rbuckton I'm having the issue #16727, where typescript removes my manual comment (/*#__PURE__*/). PR #16631 will fix that?

@AlgusDark I don't think it will improve on the surprising ways the compiler can remove comments, at least not in general.

It will however directly address the underlying feature you're trying to implement. Namely annotating each class so that Uglify can tree-shake it. After PR #16631 lands you can just have to replace /** @class */ with /*#__PURE__*/ to tell Uglify it's safe to remove. Hopefully we'll be able to customize it in Uglify rather than having to create a separate plugin or loader just to do this.

@Koslun the problem here is that I'm doing it in a function, not a class transpiled by TypeScript. So I believe that we need to check why compiler is removing that comment at the beginning of the function call, since that's something I want to keep.

My code is in a HOC from React:

const HOC = /*@__PURE__*/withHelpersModifiers(Component)
export default HOC;