babel/babel

[Bug]: Assumption privateFieldsAsProperties w/o transform-runtime plugin can lead to private field name clashes

fwienber opened this issue ยท 36 comments

๐Ÿ’ป

  • Would you like to work on a fix?

How are you using Babel?

babel-loader (webpack)

Input code

When using the assumption privateFieldsAsProperties: true in combination with not using the Babel transform-runtime plugin (so that babel-helpers are inlined), it becomes possible to override #-private fields, which is incorrect semantics.

The issue can only be reproduced when using at least two different source files.

Bar.js:

export default class Bar {
  #hidden = "Bar's hidden";
  getHidden() {
    return this.#hidden;
  }
}

index.js:

import Bar from "./Bar";

class Foo extends Bar {
  // it should not be possible to override a #private field of a superclass:
  #hidden = "Foo's hidden";
}

console.log(new Foo().getHidden()); // actual: "Foo's hidden", expected: "Bar's hidden"

Since Babel's own REPL only supports one input file, here is the full example including a .babelrc in codesandbox:
codesandbox.io reproducer

Configuration file name

.babelrc

Configuration

{
  "presets": [
    "env"
  ],
  "assumptions": {
    "privateFieldsAsProperties": true
  }
}

Current and expected behavior

When running the codesandbox example given above, in the output you see "Foo's hidden", so class Foo managed to override the #-private field #hidden of class Bar, which must not be possible.
When you switch off privateFieldsAsProperties and run the example again, the correct result "Foo's hidden" is shown.
Also when you use the Babel plugin transform-runtime, the behavior is correct.

Environment

See codesandbox example:

  • Babel 7.2.0
  • Node 16

Possible solution

The reason that both #-private fields share the same name __private_0_hidden is that the helper function classPrivateFieldLooseKey tries to define and initialize a global counter variable id that is increased for each generated property name, to ensure unique private property names. However, when helper functions are inlined (no transform-runtime plugin), each bundled file comes with a copy of this helper code, each defining its own var id = 0;. Thus, in both bundled files, the private member counter starts with 0, ending up in the same private field slot name in both the superclass and the subclass, which must not happen.

One approach to solve this could be to use a Symbol instead of a generated string slot name, but I don't know if Babel helper code is allowed to use language features that may not be available in older target environments. I tested this approach and it works perfectly (given that Symbol is supported).
I guess helper code is not compiled, so using Symbol would rule out IE, unless a Symbol polyfill is applied. If this is a show-stopper, a new assumption privateFieldsAsSymbolProperties could be introduced that fixes the issue, but requires a target that supports Symbol (at least through a polyfill).

Additional context

No response

Hey @fwienber! We really appreciate you taking the time to report an issue. The collaborators on this project attempt to help as many people as possible, but we're a limited number of volunteers, so it's possible this won't be addressed swiftly.

If you need any help, or just have general Babel or JavaScript questions, we have a vibrant Slack community that typically always has someone willing to help. You can sign-up here for an invite.

We could append Math.random() to the generated string, which is "unique enough" to avoid conflicts while still keeping the IDs string-based (which works in older browsers).

The main difficulty is that we need to patch Math.random() in our tests to be predictable, but I guess it's doable.

To give more context, the main motivation we consider switching to privateFieldsAsProperties is that #-private members simulated through WeakMaps are a nightmare during debugging and heap analysing.

Having a random part in the private member name would help against name clashes, but it would make the compiler output non-deterministic, which is not only a bad thing for integration tests (as you pointed out), but also for comparing compilation results, e.g. when testing different Babel settings or when trying to debug a Babel bug, and for readability/usability of the generated code.

If adding a Symbol polyfill (e.g. the one I linked above) is really not an option, what about establishing a counter that is global even in case of inlined helper functions? This could be achieved through a global property (globalThis.__BABEL_PRIVATE_FIELD_COUNTER or the like) that is initialized exactly once, or as a global function that acts as a counter (even better encapsulation). Another advantage: the compiler output would not change at all!

what about establishing a counter that is global even in case of inlined helper functions?

Then we might be able to do it directly based on id.
Perhaps we can use closures or something to avoid such interference?

Also I recommend using something like iife to avoid global effects, as far as I know more than one transformation creates global variables.

Also I recommend using something like iife to avoid global effects, as far as I know more than one transformation creates global variables.

#15136 (comment)
#7321

There should be some more.

what about establishing a counter that is global even in case of inlined helper functions?

Then we might be able to do it directly based on id. Perhaps we can use closures or something to avoid such interference?

Also I recommend using something like iife to avoid global effects, as far as I know more than one transformation creates global variables.

Not sure I got you right, but I think since the code is copied into every generated file, putting the id in a closure is not an option, there is no way around using some "global variable", i.e. some property of the global JavaScript object. Since we cannot use Symbols, it has to have some (cryptic) name.
What language features / ECMAScript level can we use in helper code? Arrow functions? globalThis? Do they run in strict mode (I guess so)?

var id = id || 0;
Will this work? (just guessing, I'm not familiar with this)

If helper code runs in strict mode, no. The right-hand side id would not be defined, raising an error in strict mode.

At least in our build, I see "use strict" in the generated JS files. Might depend on Babel settings, though.

Assuming in the context of helper code, this is the global object, my suggestion would be:

if (!this.hasOwnProperty("__PRIVATE_FIELD_COUNTER__")) {
  Object.defineProperty(this, "__PRIVATE_FIELD_COUNTER__", {
    value: (function () {
      var id = 0;
      return function () { return id++ };
    })()
  });
}
export default function _classPrivateFieldKey(name) {
  return "__private_" + this.__PRIVATE_FIELD_COUNTER__() + "_" + name;
}

The code uses no JS features (like arrow functions) that may be unsupported, except Object.defineProperty(), which I have seen in other helper code, so that seems to be okay.
Using Object.defineProperty() only with a value ensures that it is not enumerable nor configurable, so no-one can fiddle with our global counter function. If we wanted to avoid it, it could simply be replaced by this.__PRIVATE_FIELD_COUNTER__ = ....

Next I'll patch helpers.js, try it out in our workspace and report back here.

@fwienber Good idea! To save some bytes we can also make __PRIVATE_FIELD_COUNTER__ a getter.

Okay, so this did not work, it was somehow magically converted to (void 0) (explicitly converted to undefined). A bit of digging around led to the somewhat ugly, but robust solution (0,eval)("this"). Also adding the suggestion to use a getter leads to

  if (!Object.prototype.hasOwnProperty.call((0, eval)("this"), "__PRIVATE_FIELD_COUNTER__")) {
    Object.defineProperty((0, eval)("this"), "__PRIVATE_FIELD_COUNTER__", {
      get: (function () {
        var id = 0;
        return function () {
          return id++;
        }
      })()
    });
  }
  export default function _classPrivateFieldKey(name) {
    return "__private_" + (0, eval)("this").__PRIVATE_FIELD_COUNTER__ + "_" + name;
  }

I tested this in our workspace and it does the trick, fixing the "local counter" problem, without using any "modern" JavaScript in the helper code.

If you agree that this is the way to go, I'd start a PR with this solution.

What do you think about passing to _classPrivateFieldKey an hash of the current file path?

What do you think about passing to _classPrivateFieldKey an hash of the current file path?

Interesting, I didn't dare to ask whether such context information is available in the first place!
However, the current file path would be the path on the local developer machine, which would make the compiler output dependent on where the workspace is built.

If the fully-qualified module name, i.e. the name of the containing package and the relative file name of the containing module, would be available, together with the class name, we could use the hash of a name that would be independent of where the workspace is built. But even then, the compilation result would depend on a lot of context, so it would change when you rename the package or move/rename the source file, even if the content stayed the same. Furthermore, does Babel compile output consider package name or source file locations anywhere else (except module import/require resolution)? Finally, this approach sounds more complicated than simply using a global counter.

Okay, so this did not work, it was somehow magically converted to (void 0) (explicitly converted to undefined). A bit of digging around led to the somewhat ugly, but robust solution (0,eval)("this"). Also adding the suggestion to use a getter leads to

  if (!Object.prototype.hasOwnProperty.call((0, eval)("this"), "__PRIVATE_FIELD_COUNTER__")) {
    Object.defineProperty((0, eval)("this"), "__PRIVATE_FIELD_COUNTER__", {
      get: (function () {
        var id = 0;
        return function () {
          return id++;
        }
      })()
    });
  }
  export default function _classPrivateFieldKey(name) {
    return "__private_" + (0, eval)("this").__PRIVATE_FIELD_COUNTER__ + "_" + name;
  }

I tested this in our workspace and it does the trick, fixing the "local counter" problem, without using any "modern" JavaScript in the helper code.

If you agree that this is the way to go, I'd start a PR with this solution.

Personally, I think this is a bit dangerous, compared to the current problem.

Can you explain why this project directly merges the output of multiple files without using IIFE or bundler?

If we really need to change this, I'd be more inclined towards something like var id = id || 0;, which is also what esbuild generates for ts' enums.

Also would this work?
This is currently what ts and babel generate for enums, which should be widely accepted.

var id;
id = id || 0;

Personally, I think this is a bit dangerous, compared to the current problem.

๐Ÿ‘, eval breaks under most CSPs.

Can you explain why this project directly merges the output of multiple files without using IIFE or bundler?

The problem in this issue also happens without merging the files.

If you don't use @babel/runtime, the private helper is duplicated in every file and thus it starts counting from 0 every time.

Maybe we could wait until Babel 8 and change the helper to use Symbol, which of the various suggestions in this thread is the one I like more. There are less and less users relying on IE, and requiring them to load a symbol polyfill wouldn't be too bad.

Sorry, I see.

Can we define a non-enumerable property on a class?

Object.defineProperty(Bar, "__babel_class_id", {
       enumerable: false,
       value: {
         get() {
           return _classPrivateFieldLooseKey("hidden");
         },
       },
     });

Then during compilation, first try to use the __babel_class_id of the parent class.
I'm not sure if this will have an observable effect. (Perhaps we can remove it after internal initialization is done?)

If we really need to change this, I'd be more inclined towards something like var id = id || 0;, which is also what esbuild generates for ts' enums.

Like discussed above, this is not an option, as the code is executed in strict mode ("use strict"), where undeclared variables lead to an error.

๐Ÿ‘, eval breaks under most CSPs.

Okay, so there is no way to access the global object in Babel helper code?
What about a simple polyfill for globalThis (essentially window.globalThis = window)?

Sorry, I see.

Can we define a non-enumerable property on a class?

Object.defineProperty(Bar, "__babel_class_id", {
       enumerable: false,
       value: {
         get() {
           return _classPrivateFieldLooseKey("hidden");
         },
       },
     });

Then during compilation, first try to use the __babel_class_id of the parent class. I'm not sure if this will have an observable effect. (Perhaps we can remove it after internal initialization is done?)

We could try to give classes unique IDs and then use this ID as part of the private field name.
But that only shifts the problem from generating globally unique IDs for fields to generating globally unique IDs for classes, which leads to the same problem with multiple copies of inlined helper code.
And currently, the _classPrivateFieldKey() method only receives the field name as its single parameter, not the class to which this field will be added. Adding the class as a parameter would be a breaking change, I guess.

We could try to give classes unique IDs and then use this ID as part of the private field name.
But that only shifts the problem from generating globally unique IDs for fields to generating globally unique IDs for classes, which leads to the same problem with multiple copies of inlined helper code.

No, I mean try to get hidden id from parent class at execution time.
Something like this (just a crude example)
This doesn't require the id to be generated at compile time, but at execution time.
image
image

If we really need to change this, I'd be more inclined towards something like var id = id || 0;, which is also what esbuild generates for ts' enums.

Like discussed above, this is not an option, as the code is executed in strict mode ("use strict"), where undeclared variables lead to an error.

Oops, turns out I was wrong about this!

Even in strict mode, when using var (not let), the id on the right-hand side is already defined and has the value undefined. Also even in strict mode, it is allowed to declare a var multiple times. So it seems this approach actually works to declare a global variable and initialize it exactly once.

What I still would recommend is not using the name id for a global variable. The chances this name-clashes are high!

That said, it does not solve the problem. The inlined Babel helper code is not evaluated in a global context, but in some IIFE (to simulate a module).

I guess the only chance of establishing a global counter without being able to use "real" ES5 modules is to use a property of the global object, so we need to find some way to access the global object.

I know that modern JavaScript tries everything to avoid / disallow modifying the global object ("global scope pollution"), but unfortunately, in this case, for backwards-compatibility and to avoid a run-time dependency on Babel, we are stuck with "old" JavaScript.

@liuxingbaoyu If I get what you are trying to achieve correctly, you're attaching private name information at class-level. We could try to avoid name clashes at run-time, if _classPrivateFieldKey() receives the class object (for static fields; prototype for instance fields). Then, we could detect a name clash and increase a local counter until we find an unused slot name. But like I said above, this has impact on the API and as such leads to changes in all other helper code that uses _classPrivateField[Loose]Key() ...

core-js at least includes a polyfill for globalThis, moved to stable ES as of version 3.3.0. So I guess a babel helper could use it, unless another polyfill bundle is supported that does not include this polyfill.

One approach to solve this could be to use a Symbol instead of a generated string slot name

Can we emit Symbol as private field key in

init = t.callExpression(state.addHelper("classPrivateFieldLooseKey"), [
t.stringLiteral(name),
]);

when the class has superClass property? So we still use the debug-friendly string approach for the common case (a base class) and then fallback to the symbol approach when it is inherited from something.

The Symbol polyfill should work if we insert Symbol directly instead of doing it in the helpers. Essentially we are emulating the Symbol behaviour by a global id counter, so I figured maybe we should just use Symbol and let polyfill plugin take care of the rest.

Sorry, I was distracted by another topic.

Yes, @JLHwung, a Symbol polyfill sounds like the best solution.

The Symbol polyfill should work if we insert Symbol directly instead of doing it in the helpers.

Do you mean the solution would be to generate the Symbol(...) code as part of the transformation and hand it into the helper function, so the helper code itself would not use Symbol? Is the idea behind this that transformations can add polyfills, but helper code can not?

Okay, so I'll just go ahead and try out your approach.

when the class has superClass property?

Why only then? Given the dynamic nature of JavaScript, I'd suggest to always take care of #-private members not name-clashing. For example with certain implementations of mixins, such name clashes might happen even without inheritance.

I tried this approach and updated my draft PR #15393 accordingly.
It works just as well as my initial proposal to change classPrivateFieldLooseKey() to return
the Symbol.

The Symbol polyfill should work if we insert Symbol directly instead of doing it in the helpers.

To understand the difference between putting the Symbol(...) call in the helper code (my initial suggestion) and directly generating it (@JLHwung's proposal) - is the reason that it could be harder to include the polyfill before the helper code if the helper code is imported rather than inlined? AFAIK, polyfills should always be loaded very early, before the normal package loading starts, so I don't really see the problem here.

The new solution works, but has the disadvantage that a lot of compile output changes (I already updated the test output code in my PR accordingly). Of course, it also makes the compile output a bit smaller.

According to my understanding, the polyfill is injected on demand, and the assistant is not necessarily inserted directly into the code, it may also be a require or import, which will not import the dependent polyfill at all. (Even the inline code I'm not sure will be handled by the polyfill.)

So how can we test whether the polyfill is added automatically when required? The combination that would fail with the changes should be

  • privateFieldsAsProperties = true
  • transform-runtime plugin on, so that helpers come as external library
  • target = something outdated that does not support Symbol
  • of course the application code itself not using Symbol

Another thing is, in my PR CI, all tests are green but one: babel-old-version - babel-plugin-transform-unicode-regex - unicode-regex - negated-set fails with a diff. Can this be related to my change? make test is green in my local workspace, but I am not sure whether "babel-old-version" has been executed at all. Do I have to rebase on current main? Can I just retry this test in the CI? Any clue would be appreciated!

babel-old-version - babel-plugin-transform-unicode-regex - unicode-regex - negated-set fails with a diff

This is not related. It is from an upstream update. In e2e test the lock file is deleted to test against everything on the bleeding edge.

Of course I cannot test this in the Babel REPL, since Babel does not process Symbol at all, even when setting target = "ie 6". You cannot see whether polyfills would be included somehow magically, but it is definitely not added to the output like the other Babel helpers.

I did a bit of searching and this StackOverflow article sounds to me as if you have to take care of including (babel-)polyfills yourself if you really still want to target Internet Explorer or any other outdated environments. This is rather a question of how your packager (WebPack etc.) is configured.

So in the scenario given by the bullets above, using Symbol only would break if someone really made their application work in IE without including any build mechanism for automatically adding polyfills, which I can hardly imagine, or if they optimized-out the polyfill for Symbol manually because it was not used before - then I'd say it is their problem to fix that after a Babel update.

So I'd say it is safe to use Symbol in generated code.

Thanks @JLHwung, after your PR #15415 has been merged into main, rebasing on main did the trick and now all tests are green!

Anything missing? If not, what would be the next step to get this accepted / merged?