Why are underscore prefixes for internal methods of a React component considered bad?
ttmarek opened this issue Β· 119 comments
The third bullet under react#methods says:
Do not use underscore prefix for internal methods of a React component.
I was just wondering what the reasoning behind this was?
They give people a false sense of "private"ness that could lead to bugs.
Okay. Even though JavaScript doesn't support private variables, I still think the underscores help communicate the programmer's intent. If possible, @lencioni could you give an example of this false sense of "private"ness that lead to a bug? Or maybe a scenario where the underscores would confuse another programmer and lead them astray. I'm not disagreeing with you, I just want to have a better understanding of how this rule came about.
Private means inaccessible. Your intent to privacy is irrelevant if the value is reachable, ie public.
For example, npm broke node once by removing an underscore-prefixed variable.
npm broke node once by removing an underscore-prefixed variable
If the variable wasn't prefixed they would have used it either way, and node would have still broke. When a meant-to-be-private variable is prefixed with an underscore there is at least one line of defence (a weak one, granted) preventing its use. Whereas when a meant-to-be-private variable isn't prefixed with an underscore, there is no line of defence (not even a weak one).
@ttmarek yes, but had they not prefixed it with an underscore, nobody would have been fooled into thinking it was actually private, and there would have been test coverage, and changing it would have prompted a semver-major version bump.
An underscore is in no way, whatsoever, a line of defense - it's just an incorrect assumption some developers make about a language that simply does not have the concept of privacy outside of closures.
@ljharb So, in other words: underscore prefixed methods might confuse programmers from other languages into thinking that JavaScript supports private methods. Right?
That, or programmers who think that convention is the same as privacy.
Okay, cool. Thanks @ljharb, I see what you're getting at. Do you think its worth adding a short blurb about this under the third bullet in react#methods?
I think so. I'd be happy to merge a PR that does that.
Sure, more helpful info is always good.
My favorite reason:
Private stuff tends to change without warning, and without advertising as a breaking change.
The problem with that is that a lot of code could break without warning, because EVERYBODY ignores underscores:
Newbies don't know what it means. Experienced developers think they know what they're doing and the underscore doesn't apply to them.
Admit it. You've done that, too.
I think @lencioni is perfectly correct by saying
They give people a false sense of "private"ness that could lead to bugs.
... but I'd like to elaborate on why.
In programming languages in general, the point of having a private property (as opposed to having all properties public) is so that you can trust a private variable not to be changed by outside code, unlike public properties which you need to assume can be changed by outside code at any time.
If you're absolutely hell bent on using inheritance in JavaScript, you don't have private properties. It is not a feature of inheritance in JavaScript. Everything is public. The logical thing to do is therefore to treat every property as a public property (because it is) - your code needs to assume that it can change at any time.
You can call a property _myProperty or __DONTFUCKINGCHANGEmyProperty but that doesn't make that property private. It's public. Private does not exist in the object composition model that you have chosen. You're trying to reap the benefits of a feature that you do not have. If you write your code in a way that assumes that your public-property-with-special-name-convetion does not change, you're just fooling yourself. The fucker can change at any moment, just like any other of your properties.
(Sidenote: This is React, and the following is not an option there, but in general, unless you need to create more than 50000 items per second, I personally think that you should consider using factory functions (https://www.youtube.com/watch?v=ImwrezYhw4w) instead of inheritance for creating objects. They have real privacy and do not have this problem)
Hi everyone, I've landed here after searching for an ESLint rule for enforcing this convention. Since I haven't found any, I've written one myself. In case anyone is interested, here it is: https://github.com/buildo/eslint-plugin-method-names
@gabro there's no-underscore-dangle
which prevents it on identifiers - it seems like that would be a useful addition to the core rule, rather than a separate plugin. You should file an issue requesting it on eslint
itself :-)
@ljharb yes, I will. Not sure whether to restrain the scope to underscores or to keep it generic as it's now, but I guess we'll figure it out in the ESLint issue.
@ljharb FYI here's the proposal eslint/eslint#7065
I actually don't agree with this. I think underscore is a good way to indicate a private method. Even if the JavaScript runtime does not enforce it. However, given React's imperative style, it doesn't matter much. In 99% of the cases, most internal methods (other than the React callbacks) are typically private.
It's not actually private if it's not enforced - as such, it constitutes either a mistake or a lie.
The underscore convention has been successfully used in the python world for years.
It is not a mistake. Everybody knows it does not prevent access to the variable (except absolute novices that are quickly undeceived).
And it is even less a lie (!!!!). Even though I am not too sure what it means to "lie" when coding, I can tell that nobody is pretending it enforces anything.
It is is a convention. Nothing more and nothing less. A convention telling: "This is internal, you can access it, but this is not supported and you'll do it at your own risk."
You can like this convention or not, there are pros and cons and I don't think the point is to discuss it here, but don't call a whole community "lier" or "stupid".
Indeed, it is a convention - one that causes confusion by implying privacy where none exists. There are certainly many ecosystems that have used convention as a boundary, but in the JS ecosystem, we've learned that convention is not strong enough of a boundary. If it is possible, people will do it, and any damage caused is the fault of the software author - never of its users. "Use at your own risk" is a reckless and hostile thing to claim as a software author - I, and this guide, choose to instead treat our ethical obligation to our users with higher respect than that.
I believe this is subjective (so far nobody gave me any convincing arguments of the assessments like "not strong enough"). I don't think that I - and the others who disagree with these assessments β are being "reckless", "hostile" or disrespectful when we chose not to use workarounds to hide the internal state of our component to our consumers (I have not found a "private" JS pattern readable and handy).
Is it my responsibility if someone code is broken because he used a variable I told him he should not?
When you buy a electronic device, it is usually not recommended to open it. Is it Samsung or Sony's fault if you broke your laptop when trying to change welded chip? I don't think so, and they usually clearly state that doing such voids the warranty. You can do it if you want to hack it, but of course you may not complain if you broke it. Apple decided to enforce it a bit stronger by using non-standard screws. You may chose to do something similar.
The general argument is that internal state should be protected. To quote again the python community: "who are you protecting the attribute from?". I think that is just really a question of preference.
Yes, as an author, it's your fault if your users are broken, even if they disobeyed your instructions. It's your obligation to protect them from themselves - the "we're all consenting adults here" philosophy doesn't really work.
This thread is very interesting, I think it goes beyond the actual "underscore" issue.
In my opinion, most of the open source world works by asuming that "we're all consenting adults".
SemVer is a weakly enforced convention (lib author decides when to change major or not) and yet packagers rely their whole dependency-update-logic on the assumptions that the developer was careful enough to follow the versoning rules.
I understand that everything is publicly accesible in JS, but that doesn't mean it has to be part of the public API of your library. IMHO, if a consumer is using a piece of code that doesn't appear in documentation and/or has an underscore prefix, then he should be treated as a consenting adult.
@scarmuega Just override the rule. This style guide is strongly opinionated, in particular when it comes to that topic. After some thoughts this is actually fine, we need Crockfords. You don't have to take the style guide entirely, and should not take it blankly.
I'll just say that if we were all indeed "adults" in the sense you mean, nobody would be monkeypatching and modifying builtin prototypes such that it blocks evolution of the language or breaks others; nobody would be writing XSS or CSRF exploits; etc. semver is an entirely different beast, since it's not part of the language or the code - just the ecosystem.
Certainly one interpretation of semver is that "your public API is what you define it to be". My interpretation is "anything that is reachable, observable, and/or accessible". The former often causes intentional or unintentional breakage when not in a major version bump; the latter adds negligible maintainence costs yet avoids any breakage short of a major version bump. I always prefer to choose safety; you're obviously free to choose otherwise.
@QuentinRoy thanks for the tip. I didn't intend to sound critic of airbnb's style guide, all the contrary, I think it's awesome work. I just thought that the discussion was really interesting, so I wanted to give my opinion.
@ljharb I get your point and I agree. What I wanted to state is that our current open-source ecosystem, like it or not, already works assuming we are adults (npm left-pad chaos is a clear example). Maybe it'll evolve into something more "strict" in the years to come, I certainly hope so. BTW, why the tumbs down? You think my comment was inappropriate?
The thumbs down indicates disagreement; I disagree that "we're all consenting adults" is a beneficial or accurate policy, in open source or otherwise. Grownups set rules for themselves (called "laws"), they don't just "trust" that things will all work out.
@scarmuega while I appreciate the sentiment, the "we're all adult" argument falls short in my opinion.
Even adults make mistakes, and that's why we - as an industry - use tools to avoid such mistakes and enforce the "right way".
Compilers, linters, editors, you name it. We could assume we get everything right, but we know this is a lie so we take measures.
Semver is a particular unfortunate case in which tooling doesn't help since computers are not very good at semantics, the core of semver itself. As a mental exercise, suppose we had a way of automatically determining what "breaking" means: wouldn't you adopt a tool to check that nothing breaking happens in a patch version bump? I think it would be an industry standard, because we're adults, but still humans, and humans make honest mistakes when given the opportunity.
@gabro I would totally adopt a tool that enforced semver. I would also love if JavaScript allowed private members. In the meanwhile, we are left with conventions.
@ljharb I struggle to assume that treating all of your code as "public API" adds only negligible maintenance, but I trust your experience and trajectory, so I'll give it a try. Thanks for your insights.
@scarmuega I know. I am myself an advocate of the _
convention. But here is not really a good place to discuss it (I tried, it was not a pleasant experience).
@gabro I would argue breaking a rule or a convention is usually not something you do by mistake. But this is as hypothetical as your assumption.
The argument @ljharb just put forward is, as far as I remember, the only _
problem I have been reported that is actually founded on observations / evidence (people monkey patching prototype ending up blocking JS evolution). I find it bit far fetched when it comes to the _
convention though.
@QuentinRoy the monkeypatching thing isn't related to _
, it's an example of why the "consenting adults" argument is wrong.
@ljharb Yes. Though "consenting adults" implementation usually starts with clear understanding of what is supported and not supported. E.g. _
says you should not touch that, I let you free to break this rule, but the results are not guaranteed. I think monkeypatching host objects' prototypes is a slightly different matter because it was/is considered an OK practice. There is no clear contract, no convention, one may even argue this is supported. Hence, I am not convinced it is really a "consenting adults" problem.
Monkeypatching host objects has not been considered an OK practice in JS for many many years. There has basically always been a clear contract that you shouldn't modify objects you don't own.
I am not defending monkey-patching. But I feel like most of us once stumble on a guide showing how to extend String prototypes when we learned JS. I did. And after all, today's developers are commonly patching their environment with polyfills for non standardized features (IMHO, this is quite dangerous).
The underscore pattern is a direct result of using prototypal inheritance. In Javascript, a "private" variable is one that is declared within a function's scope, either a parameter or variable. Functions that are added to the prototype don't have access to such variables declared in a prototype's constructor, so the underscore pattern was born - a total violation of the encapsulation principle.
There is a really simple solution - don't use prototypal inheritance. It leads to confusion with the this
keyword, the underscore prefix hack and other bad, bad problems. Modules provide you with everything you need to construct complex systems without the need for inheritance, thus encouraging better use of scope and general encapsulation.
Don't get me wrong, prototypal inheritance has its uses, but it shouldn't be the default approach to building software with Javascript. The industry seems to encourage it (cough react cough TypeScript), but I always encourage people to use a more functional style and learn to use function level scoping more effectively.
@danderson00 response is spot on.
The industry seems to encourage it (cough react cough TypeScript)
Although React uses inheritance, I wouldn't categorize it as encouraging its usage in general. More information: https://facebook.github.io/react/docs/composition-vs-inheritance.html
So the argument boils down to since there is no mechanism to enforce the private access, let's not use a convention either? Am I right? That sounds rather absurd.
@omeid No, the argument boils down to "since there is no mechanism for private access in the language, any architecture that relies on private instance methods is automatically inappropriate for JavaScript."
That is interesting, so do you suggest checking the value of properties before use everywhere?
That is interesting, so do you suggest checking the value of properties before use everywhere?
The safe thing to do would be assuming all properties are public and accessible, which is the technical reality.
@gabro That is a safe assumption, sure, but what does it means in reality? Do you do validation before using every and any property?
My understanding is that Information Hiding and Encapsulation --by extension-- is a fundamental aspect of OOP programming with Classical Inheritance, something which is being heavily used in modern JavaScript --ignoring the fact that it is mostly just a facade over the underlying Prototypal Inheritance; this makes the whole let's forgo conventional encapsulation because ideally it would be enforced by the language pretty unreasonable, IMHO.
I agree information hiding and encapsulation are helpful.
However they are when they can be trusted, and currently JavaScript doesn't have a way of providing this level of trust.
Working around the lack of a language feature through convention seems like a bad idea, because we're trusting the users to behave, which is untrue in the general case.
Thankfully there's an ongoing proposal for truly private members in JS :)
@gabro so how do you suggest approaching something like this?
Here, the Engine has some internal information that is stored in a state store for convince. How can this be avoided?
class Engine {
// using flow type annotations to give context
_store: Store<engineState, engineActions>
/* some APIs here that use the store */
}
What is lacking in all arguments against use of underscore to denote private method is some or even any concrete examples of avoiding relying on internal state, which to be very honest seems like an impossible task.
A good design does not have a way to be broken.
If code breaks because you failed to harden it, it's your fault - even if the user did something malicious. This includes if the user passed you the "wrong" type of thing (and you failed to runtime typecheck it), if the user tampered with methods on your object, if the user mutates your prototypes, etc. Object.freeze
exists; closure-privacy exists; use these tools.
It's not that you can't have "internal" state - it's that that state is 100% fully public, and you are lying to yourself if you pretend it's "private". Just commit to that state being public, don't break it unintentionally, and you'll be fine.
Again, that sounds all reasonable but how is using a convention to indicate private state worst than no indication at all?
Also, what seems to be ignored is that privateness (and underscore prefix in this case) does not only serves as a mechanism for enforcement but also communicates the design (Γ la self-documenting code), in some languages you can even bypass the privacy mechanism, that doesn't suddenly makes the whole system useless.
A false sense of security is worse than none at all.
As I said above - in JavaScript, a language without instance privacy, a design with privacy is inferior and inappropriate.
@ljharb Could you provide an example of how you would approach something like this?
Here, the Engine has some internal information that is stored in a state store for convince. How can this be avoided?
class Engine {
// using flow type annotations to give context
_store: Store<engineState, engineActions>
/* some APIs here that use the store */
}
One possibility is that you don't avoid it. You just make store
a public property.
And then commit to the whole store implementation as part of your Engine API? How is that better than indicating that it is internal state, please do not rely on?
And following up on
@ljharb: Yes, as an author, it's your fault if your users are broken, even if they disobeyed your instructions. It's your obligation to protect them from themselves - the "we're all consenting adults here" philosophy doesn't really work.
Isn't indicting that the store is private and you should not rely on part of doing your protect the user job? instead of pretending that it is some sort of intentional API?
The problem seems to convey the meaning of store
being an internal API, that may be subject to change. In that case, I think documentation would serve a better purpose than prefixing with _
, which would give a false sense of enforced privacy.
Anyway, if for your team the _
convention works ok there's nothing preventing you to adopt it.
You can request people not rely on it all you want (which is what a convention does), but while JS lacks privacy, its already part of your Engine API - that's the point, it's fully public no matter what you do anyways.
If you don't want it to be part of the API, you can use a WeakMap in a closure only accessible by the class, or you can have some sort of public string ID that can be used as a lookup key to some other form of storage.
@gabro That is what we did, but it is unfortunate that many people forgo a useful convention because of dogma.
@ljharb The notion that conventions have no value is false.
you can use a WeakMap in a closure only accessible by the class, or you can have some sort of public string ID that can be used as a lookup key to some other form of storage.
That seems over engineering to me.
Sorry everyone for all the spam, it seems like this is an ideological concern and I don't think any conversation of pragmatism would help, so I am gonna drop it.
I appreciate you communicating your point of view @ljharb and @gabro; although a concrete example would have gone a long way. :)
@gabro That is what we did, but it is unfortunate that many people forgo a useful convention because of dogma.
dogma is a strong word. This is more of a different opinion on the outcome of conventions.
My opinion is that this particular one is rather weak and dangerous because it cannot be enforced and it may lead to unexpected problems where a user treats it as public and the author treats it as private.
a concrete example would have gone a long way. :)
I would personally go for something like:
class Engine {
/**
* The underlying Store.
* This is not meant to be used directly, and it may be subject to breaking changes.
* We recommend you use the API instead.
*/
store: Store<engineState, engineActions>
/* some APIs here that use the store */
}
@gabro Apologises if you find dogma too strong; no hard feelings. I respect your opinion but without a reason arguing against something that objectively communicates more consistently about a convention more than just a comment feels pretty dogmatic.
Just a comment:
class Engine {
/**
* The underlying Store.
* This is not meant to be used directly, and it may be subject to breaking changes.
* We recommend you use the API instead.
*/
store: Store<engineState, engineActions>
/* some APIs here that use the store */
}
Comment and a name that stands out like a sore thumb:
class Engine {
/**
* The underlying Store.
* This is not meant to be used directly, and it may be subject to breaking changes.
* We recommend you use the API instead.
*/
_store: Store<engineState, engineActions>
/* some APIs here that use the store */
}
Or consider this from the auto-complete list; something is not like the others:
move()
blackList(args...)
getState()
playerName
state
_store
That's a very good point, from an autocompletion-based exploration point of view.
However, for the reason exposed above, I still stand by it being a dangerous convention to foster and in general the design shouldn't rely on private members being private, unless properly enforced.
It is not ideal, but I fail to see the argument why it is worst than nothing? specially in contrast to using only comment to communicate privateness?
The rationale is essentially the one at the beginning of this issue: fostering the idea that _foo
means private can lead people to making the wrong assumptions (i.e. that this is enforced somehow), which in turn can lead to bugs.
Comments are a way of conveying meaning without implying any language feature.
Fostering the idea that _foo
means private results into people using it in a non-private way sounds like a paradox to me.
And if people are not going to follow convention, nothing is stopping them from returning arrays with forEach
replaced, or objects with other built-in methods changed; following the argument that if it is not enforced then it can't be trusted would render JavaScript more or less completely useless. No?
Fostering the idea that _foo means private results into people using it in a non-private way sounds like a paradox to me.
The contrary: fostering the idea that _foo
means private may get two users of the same code misaligned, one using it (since it's accessible), and the other one changing its behavior in a breaking way (assuming it's ok because it's private).
Assuming that users will be using it in a non-private way is actually the safe default.
And using only a comment to indicate that foo
is private prevents that?
I'm not advocating comments as a solution. However, comments (as other possible means) don't contribute to increasing the risk of false assumptions being made.
Conveying meaning through syntactic conventions is what I'm personally wary of.
Seems like we are going in circles now. Sometimes internal state is necessary and just like comments, an underscore communicates the intention. Forcing against this convention to protect the code base from programmers who lack basic understanding of JavaScript seems pretty unreasonable; because how far do you want to stretch that? As I mentioned earlier, programmers with poor understanding of JavaScript could rewrite one of the many fundamental object methods, or break other things in unimaginable ways.
Anyways, people defer to this Style Guide a lot, and I think I have made all my points, so next time this brings up in a team that I work with, I will point them to this issue instead.
My argument is that internal state (class-instance-private state, that is) is only necessary if your architecture is inappropriate for JavaScript.
Slippery slope arguments not withstanding, fostering a convention that X is Y, when it is not Y, is always harmful, because that convention will never reach 100% acceptance. Also, programmers with an expert understanding of JavaScript, not a poor one, will know that your "convention" is irrelevant, and that those properties are private.
@omeid Here's the gist:
Encapsulation is a technical problem with technical solutions that work reliably all the time, regardless of the people involved. Solutions use closures: constructor functions, factory functions, functional mixins, stamps, WeakMap (if you're feeling weak), etc...
The underscore convention is a people solution which requires every individual involved to be aware and on board -- or it doesn't work.
In other words, if you use the underscore convention, you take a solved technical problem and turn it into an intractable people problem.
Save yourself the trouble. Pick the right technical solution for the problem at hand (usually, it's a simple closure), and get on with making software.
My argument is that internal state (class-instance-private state, that is) is only necessary if your architecture is inappropriate for JavaScript.
React relies heavily on this. Is React, Angular 2, and you name it, Components inappropriate for JavaScript?
Slippery slope arguments not withstanding, fostering a convention that X is Y, when it is not Y, is always harmful, because that convention will never reach 100% acceptance. Also, programmers with an expert understanding of JavaScript, not a poor one, will know that your "convention" is irrelevant, and that those properties are private.
What does knowing that a convention is irrelevant means?
For example, it is by convention that people don't change built-in Object and Array methods --other than polyfils et al-- so what does this being irrelevant means?
Should we assume like .map
on an Array could be anything because convention doesn't anything?
Yes, 100% absolutely you should, if you want to be truly robust.
However, there's still a difference between "an API you are responsible for" and "a built-in API that others could tamper with" - again let's not slippery-slope this into a different discussion than it is.
In case some readers are unfamiliar with closures, JavaScript has real privacy built-in. Closures already support all the fancy OO features you're used to from OO languages -- we just do it a bit differently in JavaScript:
class Engine {
constructor(store) {
Object.assign(this, {
setState (newState) {
store = newState;
return this;
},
getState () {
return store;
}
});
}
}
console.log(
new Engine(2).setState(3).getState()
);
Yes, 100% absolutely you should, if you want to be truly robust.
Many high-profile hacker targets freeze builtins before loading 3rd party code for exactly this reason.
@ericelliott You argument fails to consider the fact that a lot of languages (Java, PHP, Ruby, Go, even C++, yup!) allows you to bypass visibility, so the idea that the only solution is one that is strictly enforced is unrealistic.
You argument fails to consider the fact that a lot of languages (Java, PHP, Ruby, Go, even C++, yup!) allows you to bypass visibility
Have you ever played the game "Lemmings"?
The fact that other languages let you do stupid things should not be used an excuse to do stupid things yourself.
This will all be a moot point if and when private props are added to the language. It's already at the proposal stage, and I am fairly certain we'll get them.
I'm also certain that they'll add complexity and edge-cases to object/constructor behavior and that you'll still be better off learning closures.
But at least you won't be using underscores, so that's something. ;)
@ljharb: However, there's still a difference between "an API you are responsible for" and "a built-in API that others could tamper with" - again let's not slippery-slope this into a different discussion than it is.
Then I suspect using someArray.map
should be consider as bad as an underscore? And one should use Array.map(someArray)
after making sure the first thing they do is freeze Array
?
@omeid the JS private field proposal explicitly will not ever allow you to bypass visibility - in JS, private will mean inaccessible in all cases, full stop.
Yes, if you want to continue the fallacious slippery slope argument, you should not blindly .map
off of arrays. However, again, that is an unrelated argument. You'll need to stop pursuing that path if you want to continue discussing things on this repo.
The argument against using underscore
seems to be that since it is not enforced it will create problems by people ignoring the convention, so I don't think it is unrelated to the fact majority of JavaScript programs are written with the assumption that people follow a bunch of conventions, including not changing built-ins.
@ljharb That is great. You see, I do want good visibility mechanisms, obviously, but I am kinda tired of people referencing this Style Guide as why _
is bad without any sound reasoning.
@ericelliott You have good points, but until you can tell me why
var isOpen = false; // private
Is better than
var _isOpen = false; // private
And how the earlier is harder to miss, we are just going on about theory, and we all know what that means in practice.
@omeid The very existence of this argument and disagreement is unshakable proof that conventions are not strong enough guarantees of data privacy. You can't even get everybody on board in a discussion thread. How are you going to do it in a large code base on a large team with contributors you don't even know or talk to?
Good luck with that.
I never said
var isOpen = false; // private
Is better than
var _isOpen = false; // private
I think they're both stupid, and the real solution is to just use a closure. Closures are the best guarantee available in JavaScript, and every JavaScript developer should be aware of them (should being the operative word).
I suggest that you drop the Java/C++ accent and learn JavaScript.
@ericelliott The first one is from one of your public projects. Clearly you feel like sometime people need to do stupid things, and I am just saying that there is always a slightly better way to do stupid things which communicates this stupidity more consistently so people don't miss it.
I suggest you stop making assumptions about my understanding of JavaScript, or any language for that matter and talk about the topic at hand.
@omeid If you're referring to the StampIt docs, that uses a closure for actual, real data privacy. There's no need for an underscore in that context because it can't be misused.
I'm not assuming you don't know JavaScript, but that should be immediately obvious to you if you do know it... right?
Yes, but I can't afford that in React components for example. The point is, that we need private state sometimes.
- HOCs support real data privacy in React components
constructor()
supports real data privacy in React components- Virtually all state libraries for React support real data privacy in React components
- react-stamp supports real data privacy in React components
All of those things use closures.
@ericelliott Alright, looks like I am gonna have to keep with overriding one more rule in my eslintrc, all the listed solutions seems overkill for something as simple as indicating whatever you're committed to an interface or not.
Thanks for the input everyone.
@ericelliott If I may, I don't think the fact that we are discussing about it is a proof it is an issue, i.e. a significant part of the community will not respect it and there code is going to be broken. There is thousands of successful projects relying on conventions (underscore in particular). I have seen very little cases where it caused issue (one to be precise). The python community heavily rely on it. I do not pretend to have your's or @ljharb's experience, but if we are talking about proof and the fact that it is an issue (as a researcher, I humbly know a bit on what makes a proof), the data I have seen massively indicate the opposite (I know @ljharb thinks python and JS should not be compared in their practice but I humbly still fail to see what would make JS so different here)...
There is question of styles however. These are (IMHO) mostly subjective (which does not reduce them in any way).
@omeid, as I said before, this style guide is (mostly) defined by @ljharb and airbnb and they can do whatever they want based on what they think is best. It is opinionated (though based on the opinion of a strong leader in the JS community), you do not have to agree with everything, and yes, rules are meant to be overwritten and I think you should do it if you want to.
@QuentinRoy Sure, but Airbnb is a well respect softwarehouse and this guide is something that a lot of people defer to; so much that every time this conversation of underscore comes up, someone blurts out but what about Airbnb? or some such; and when this happened yesterday I thought I would dig down to it to actually learn, what about airbnb after all? and obviously, I am not convinced, nor is anyone else, haha. But at least now I can just tell them, it is the 1Ki issue; one, zero, two, four, check it out. π
@omeid I'm not sure what's not convincing about it; and given that there's only 12 participants in this thread, it doesn't seem like there's really all that many people who aren't convinced.
we need private state sometimes
Until private fields lands in the language, the only way you can get private state is closures. Attempting to fake it by other means demonstrates an inappropriate architecture for JS, regardless of what other languages do.
@ljharb You argument is sound, but it is based on the notion that conventions provide no value, which in my opinion is pretty questionable, specially when there is a whole lot of useful conventions in the same context: writing JavaScript programs.
I understand that putting an underscore doesn't hide it from the outside, but it communicates an intent of what you're committed to and what not; to draw a parallel, in terms of API surface commitment, if people are going to ignore semver, then semver is also useless, but here we are, relying on a stupid convention that isn't strictly enforced by the runtime.
@ljharb There is only 12 people in this issue because a lot of people just take what is at face value and then a lot of people override it.
I have seen very little cases where it caused issue (one to be precise).
I have seen several during my career. "Several" seems minor, right? No big deal. The problem is, two of them (using underscores from libs) silently broke production applications and caused very expensive problems.
Let me tell you one of the stories (I have others, equally bad).
Of course our immediate reaction to things breaking on production was "OK, what changed!?"
And of course, the answer was "lots of things". So we started down the path of looking at what we changed to try to quickly diagnose a root cause -- but since our changes were not the problem, that was a dead end.
In the meantime, production is broken for 3 hours and our customer support agents are burning through hours, costing us money -- sales are impacted, costing us more money -- 5 software developers all with six-figure salaries are racing to find the solution, costing us more money -- customers are demanding refunds, you know the drill now. $$$$.
While this is going on, the only developer on the right track is busy stepping through the broken part of the code line-by-line. He discovers a fix for the issue we're aware of by tinkering with values until something works -- but it's a bandaid. He hasn't discovered the root cause, yet.
We think it's fixed and ship a hotfix. Which breaks something unrelated, but since the dangers of hotfixes are only loosely related to the topic at hand, I'll skip the fallout of that misadventure.
Then we discover another bug. It depends on the same library but the code does something completely different. We think it's unrelated, but maybe that last change caused another part of the code to break? We have unit tests, and those chunks of code seem completely unrelated, but we have to investigate... so another dev goes chasing down that path. In the meantime, I check out the deps and see that there are a few shared dependencies... maybe one of them had a breaking change!?
So I'm pursuing that lead while another dev is chasing down another, a third dev is doing a visual diff on the whole code base including dependencies, but it's noisy and it will take him hours to find it.
While I'm investigating, I'm tracing through and discover what the first colleague missed... he had a habit of auto-skipping libraries while stepping through code in the debugger -- normally a great time-saving technique. I had that turned off, and finally found the root cause.
We fixed the problem, but it took several hours.. critical bits of the app were broken in production during that span. In retro, we learned that the total cost of the misadventure was more than $20,000 USD.
Breaking changes in large, commonly used libraries cost the industry a whole lot more than that. Potentially millions.
To see this perspective, you generally have to have dev management experience so you learn to see the monetary cost of bad practices.
This one, while incident frequency tends to be fairly low, the cost of those incidents is unbounded, and potentially very large.
Because of the misdiagnosis, and the production hotfix, and the unrelated bug caused by the botched hotfix, and the fact that all of our tasks were interrupted to fix all of those things (a task interrupted takes twice as long on average and contains twice as many errors)... the memory sticks out in my mind and makes me perhaps over-wary of underscores.
But unless you're using closures, if you think you're immune, you're mistaken.
The problem with conventions is consensus.
The problem with the underscore convention is that it naturally resists consensus:
Private stuff tends to change without warning, and without advertising as a breaking change.
The problem with that is that a lot of code could break without warning, because EVERYBODY ignores underscores:
Newbies don't know what it means. Experienced developers think they know what they're doing and the underscore doesn't apply to them.
Underscores turn a solved technical problem (IMO, the solutions are quite simple, not sure what's so complex about using a closure), and turns it into an intractable people problem.
KISS applies, but those using underscores seem to be unaware of the complexity they're steering themselves into...
@QuentinRoy I define "inappropriate" as "relies on a feature that does not exist in the language"
@omeid conventions have tons of value! They serve well to communicate information between those who know the convention. What they do not do, is provide any form of security, enforcement, or encapsulation - and "limiting API surface area" is a security concern, not a communication concern.
@ericelliott That is a pretty unfortunate incident, but if you don't have reproducible builds or at least a rollback strategy you're in for a tough ride anyhow.
Underscores turn a solved technical problem (IMO, the solutions are quite simple, not sure what's so complex about using a closure), and turns it into an intractable people problem.
I am not convinced that using workarounds --complicated or not-- for something that can be simply enforced by linter is a good idea.
The example that you provided for the Engine and Store requires one to write all the methods in the constructor, which is pretty ugly and makes things like jsdoc and testing complicated, and assigning a getStore
method on this
which returns the store
from closure is pretty useless, because it leaks out the very same thing (store) that is meant to be encapsulated.
@ljharb: What they do not do, is provide any form of security, enforcement, or encapsulation - and "limiting API surface area" is a security concern, not a communication concern.
How is limiting the API surface a security concern on-itself? Just to clarify, we are not talking about IPCs here, encapsulation is anything but separation of concern.
And there is the allowAfterThis
option for no-underscore-dangel
for programmers who are too stupid to learn conventions and those who are too smart to follow conventions, and if someone is going to ignore linting errors, then any conversation about linting rules is moot.
@omeid What lint rule simply forbids ._prop
access? I have seen solutions where _
get automatically compiled as real private props (using various strategies, like scrambling the prop name inside the class so attempts at external access fail). I'm more OK with that, because it turns it back into a solved technical problem -- however, it's still not very JavaScripty -- and you're no longer really using standard JavaScript at that point.
The example that you provided for the Engine and Store requires one to write all the methods in the constructor
Yes...
which is pretty ugly and makes things like jsdoc and testing complicated
No -- it's private. There's no reason to describe it with JSDoc, and you should be doing black-box testing -- test only the public API, not the private implementation details, for the same reason it's unsafe to use private methods externally -- they're more likely to change and break than the official public API.
and assigning a getStore method on this which returns the store from closure is pretty useless, because it leaks out the very same thing (store) that is meant to be encapsulated.
Again, no, because, while superficially it looks like I'm returning the store
, the very thing we're trying to keep private, we could easily change the implementation or shape under the hood, update the implementation of getState()
, and everything keeps magically working.
class Engine {
constructor(state) {
const store = {
state
};
Object.assign(this, {
setState (newState) {
store.state = newState;
return this;
},
getState () {
return store.state;
}
});
}
}
console.log(
new Engine(2).setState(3).getState()
); // 3
Notice how the same public API keeps right on working, just as originally intended, even though the implementation details of our private store
have changed under the hood.
"Program to an interface, not an implementation." ~ Gang of Four, "Design Patterns"
How is limiting the API surface a security concern on-itself?
Limiting the API surface has the natural side-effect of limiting the attack surface. It's basically lesson 3 when you start to learn about hardening software against attacks, right after authentication and authorization.
I define "inappropriate" as "relies on a feature that does not exist in the language".
I think "relies on a feature that does not exist in the language" is a cause of inappropriateness, not a definition of it. If it is, then
fake [private fields] by other means demonstrates an inappropriate architecture for JS
is a bit tautological. The reason I am asking is because I am trying to understand the problem: what is the bad concrete thing that may happen if I use _
and why. I was looking for something like what @ericelliott gave a bit later.
Also again, I would like to emphasize that nobody is trying to fake anything and I think this is important. We are merely using a convention to indicate to consumers of our stuff that doing something, i.e. relying on the value of a particular label, is unsupported by our API.
@ericelliott, I entirely agree of course this increases the attack surface, but unless you're exposing an API to interact with your own infrastructure, attack from whom and against whom? If I build a slider and my consumer "attacks" the API by doing stuffs explicitly unsupported, he is attacking himself really.
I know @ljharb also disagrees with this, but I still consider that my responsibility is to be clear about how my thing is designed to be used, not to prevent people from "hacking it" in their own software (if they want to they will manage, that's open source anyway).
EVERYBODY ignores underscores
I have a hard time believing this. I mean this is false anyway because at least I do. What is the proportion of people who do not respects it, you know better than me, but I am humbly surprised it is so high. I understand people could do it because they don't know what it means. However:
Experienced developers think they know what they're doing and the underscore doesn't apply to them.
That looks like very dumb to me. You know this is unreliable and you still rely on it? You can blame me, but I will not take the responsibility if you put your cat in my microwave despite me telling you that you should not put cats inside my microwave. And I am still not planning to add cat sensors inside (at least not for this reason).
@ericelliott In my opinion it is less a question of consensus than awareness. What I mean is I don't care if my consumer agrees with me using _
or not. As long as he knows what it means, as far as I am concerned he can safely use my stuff and there is no problem. Where I think you do have a very good point though is that as I understand and from your (valuable) experience, it is not clear for many people in the JS community that _
means "don't touch me or you're on your own". I would have expected this population to be small enough to be negligible (maybe because I did python before JS) and that such mistakes would be more often that not caught during code reviews (that is easily spotted and eslint even has a rule for it), but that may be naive.
Also, thank you for reporting your experience. I have actually been looking for such examples. This indeed looks like a very bad case. You presented a lot the consequences of the issue, however little in how you determined the cause. If you are allowed to, could you flesh it out a bit more? Just to be clear I entirely trust you when you say _
was the cause, I just would like to understand why and to what extent. Where you accessing _
values? Any idea how the codebase ended up with that in the first place?
My point in asking this is that I will take responsibility for young developers not knowing what they were doingβbecause as I said, the contract needs to be clear and it is possible that the meaning of _
is not clear for the JS community. However I will not take responsibility for experienced guys consciously breaking the contract because deadline or something else.
What lint rule simply forbids ._prop access?
The allowAfterThis
option for no-underscore-dangel
.
No -- it's private. There's no reason to describe it with JSDoc, and you should be doing black-box testing -- test only the public API, not the private implementation details, for the same reason it's unsafe to use private methods externally -- they're more likely to change and break than the official public API.
But all the other methods that internally depend on this private stuff needs to be documented though.
Again, no, because, while superficially it looks like I'm returning the store, the very thing we're trying to keep private, we could easily change the implementation or shape under the hood, update the implementation of getState(), and everything keeps magically working.
Well sure, in the updated example, but it still suffers from the issue that any method that depends on the internal state would have to be defined in the constructor, or use a method that exposes this internal state, which again, in turn leaks this internal state.
"Program to an interface, not an implementation." ~ Gang of Four, "Design Patterns"
Which means you should distinguish between implementation details and intentional APIs.
Limiting the API surface has the natural side-effect of limiting the attack surface. It's basically lesson 3 when you start to learn about hardening software against attacks, right after authentication and authorization.
What sequence of lesson are you speaking of? last time I checked, security was not packed into a single all-in-one course, it is a discipline with far and wide aspects and concerns.
Anyways, attack surface does not apply to object interfaces that is intended to provide separation of concern, relying on object encapsulation for security/secrecy is absurd, it is intended to provide visibility.
Further more, it is obvious that using underscore does not provide any mechanisms for visibility, and is merely a conventional contract around API stability, if you're writing secure applications without understanding the basic constructs of the language you're using, I will be concerned about the people who might rely on such work.
But all the other methods that internally depend on this private stuff needs to be documented though.
Yes, but you're documenting and testing only their public API, so you can safely ignore the details of the internal stuff. That's what black box means.
Well sure, in the updated example, but it still suffers from the issue that any method that depends on the internal state would have to be defined in the constructor,
Yes, they would be defined in the constructor. In JavaScript, we call those "privileged" methods, because they do have access to the internal state for their implementation details.
or use a method that exposes this internal state, which again, in turn leaks this internal state.
There is no need to do that, because the right way is to use privileged methods.
"Program to an interface, not an implementation." ~ Gang of Four, "Design Patterns"
Which means you should distinguish between implementation details and intentional APIs.
Yes.
And no. In OOP, the canonical way to "distinguish" between implementation details and public APIs is to encapsulate the implementation details so you can't get at them at all from the public API.
The core idea of OOP -- the very essence of what OOP is -- the concept that objects are encapsulated, and that they control access to internal implementation details and private state.
OOP = encapsulation + message passing (method calls, etc...) -- the rest is just icing to help us structure relationships between objects (facilitate encapsulation and message passing).
If you're not doing encapsulation, you're not doing OOP.
If you remove encapsulation from the list, you've lost the essence of why OOP exists in the first place.
Quoting Alan Kay now (the founder of OOP):
"Object-oriented programming to me means only messaging, encapsulating and hiding state, and extreme late-binding of all things." ~ Alan Kay source
More in-depth:
"My original thought was to have something like recursive biological cells. We have about 10 to the 14th power of cells in our body. That is a hell of a lot more cells than there are nodes on the Internet. Those cells spend almost all of their effort keeping themselves normal. They're self-repairing, and you don't have to stop the organism in order to affect repairs. And then there are some interesting mathematical properties of this kind of thing that also occurred to me, and I called those things objects."
"So I thought of objects being like biological cells, only able to communicate with messages (so messaging came at the very beginning - it took a while to see how to do messaging in a programming language efficiently enough to be useful)."