tc39/proposal-record-tuple

Equality semantics for `-0` and `NaN`

bakkot opened this issue · 208 comments

What should each of the following evaluate to?

#[+0] == #[-0];

#[+0] === #[-0];

Object.is(#[+0], #[-0]);

#[NaN] == #[NaN];

#[NaN] === #[NaN];

Object.is(#[NaN], #[NaN]);

(For context, this is non-obvious because +0 === -0 is true, Object.is(+0, -0) is false, NaN === NaN is false, and Object.is(NaN, NaN) is true.)
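In code, those existing behaviors are:

+0 == -0;            // true
+0 === -0;           // true
Object.is(+0, -0);   // false
NaN == NaN;          // false
NaN === NaN;         // false
Object.is(NaN, NaN); // true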

Personally I lean towards the -0 cases all being false and the NaN cases all being true, so that the unusual equality semantics of -0 and NaN do not propagate to the new kinds of objects being introduced by this proposal.

Thanks for this clean write-up! I support @bakkot's suggestion.

I should mention that there is one other case for which equality is non-obvious: document.all == null. But presumably document.all should not be considered to be an immutable value for the purposes of this proposal, so it couldn't be inside of a tuple or record in the first place, and the problem does not arise.

One more case to consider: what should

(new Set([#[+0]])).has(#[-0]);

evaluate to? My inclination is false, for the same reason as above.

(For context, this is non-obvious because (new Set([+0])).has(-0) is true.)


I guess I should mention another possible solution, which is to say that -0 cannot be in a record or tuple, either by forbidding it entirely or by normalizing it to 0. I'm not a fan of either solution, though the latter has precedent with Sets: Object.is([...(new Set([-0]))][0], -0) returns false.

Tuples' equality should be defined as the equality of their contents. So #[NaN] === #[NaN] should be false, as NaN === NaN is false. That's because the language should not have more quirks than it already has, confusing developers along the way (JS already has too many equality comparisons).

Well, at least that was my first thought, but I see some problems with this line of reasoning. Mainly, it is easy to check whether something is NaN (isNaN()), but it is not as easy to check the contents of a tuple.

// worse
if (isNaN(tuple[0]) && isNaN(tuple[1]) && isNaN(tuple[2])) { }

// better
if (tuple === #[NaN, NaN, NaN]) { }

Considering that one of the goals of this proposal is to improve the results of equality operations, I consider @bakkot's approach the best.

How should we decide on this question? During the discussion following the October 2019 TC39 presentation about this proposal, we heard people arguing both sides of this debate. I doubt that this is the kind of thing that we'd get useful data about by implementing various alternatives, though, as one or the other semantics here are not all that useful.

Based on the discussion above (specifically the argument that we should try not to extend the unusual semantics that -0/NaN has, to more types), the champion group prefers @bakkot's suggested semantics, i.e.:

assert(#[-0] !== #[+0]);
assert(#[NaN] === #[NaN]);

I'll likely soon update the explainer with more examples to explain this, and link back to this discussion.

Zarel commented

I'm neutral to the suggested semantics for NaN, but I think enforcing #[-0] !== #[0] will lead to many subtle difficult-to-debug bugs. Currently, I would guess most JavaScript programmers don't realize that -0 and 0 are different, and the ones that do know to use Object.is when the difference is relevant.

As for the -0/0 distinction: most people working with these values don't realize they're different. array[0] and array[-0] are the same, JSON.stringify(0) and JSON.stringify(-0) are the same, etc... as far as I'm aware, the difference only matters for 1/0 and Object.is, which aren't used by normal code. Forcing #[-0] !== #[0] would cause a lot of surprising bugs and frustration for programmers.
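A few illustrative cases of how invisible the distinction usually is (plain JS, nothing specific to this proposal):

const arr = ['a', 'b'];
arr[0] === arr[-0];                        // true: -0 converts to the same property key "0"
JSON.stringify(0) === JSON.stringify(-0);  // true: both serialize to "0"
-0 < 0;                                    // false: relational operators treat them as equal
// The difference only shows through things like:
Object.is(0, -0);                          // false
1 / -0;                                    // -Infinity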

(This isn't a problem for the NaN equivalence, where anyone assuming #[NaN] !== #[NaN] would avoid comparing them, which would not lead to any unexpected bugs.)

I would advocate these assertions:

#[+0] == #[-0]

#[+0] === #[-0]

!Object.is(#[+0], #[-0])

#[NaN] != #[NaN]

#[NaN] !== #[NaN]

Object.is(#[NaN], #[NaN])

I'd also be happy with Set's approach of normalizing -0 to 0, in which case Object.is(#[+0], #[-0]) would be true.

(I've edited this comment because it originally advocated for a more conservative change than what I'd actually prefer, but I think it'd be better to have it accurately reflect my beliefs - please keep in mind that some emoji reacts might be for the earlier version that advocated for #[NaN] == #[NaN].)

Zarel commented

As an example, what if someone uses a record/tuple as a coordinate?

const coord = #{x: 0, y: 3};

And then they decide to move it around a bit:

const coord2 = #{x: coord.x * -4, y: coord.y - 3};
const isAtOrigin = coord2 === #{x: 0, y: 0};

isAtOrigin would be false, but I think most programmers writing code like this would assume it would be true, and if they did, they'd encounter no problems other than this specific comparison.

papb commented

@Zarel I think you have good points. If we're trying to reduce the chance of bugs by programmers that don't know the details of the language, probably what you suggest is the best indeed.

Also, programmers who are aware of such details, will either:

  • Memorize whatever rules end up being decided here
  • Always check on the internet these edge cases

So to be honest I think the decision is basically irrelevant for us who know these details... So why not benefit the unaware ones? :)

Where a value is stored shouldn't change how its identity is understood. If we are comparing records and tuples based on their contents, we should use the existing identity rules the language has for their contents.

Also... as long as we're here, IEEE 754 defines -0 as equal to 0 and NaN as not equal to NaN, not for lulz, but because you can end up in situations like @Zarel points out, and NaN very purposely has no identity (NaN/NaN shouldn't be 1). These are not quirks of JavaScript but important invariants of how our chosen floating point number system works.

If we are comparing records and tuples based on their contents, we should use the existing identity rules the language has for their contents.

JavaScript has many different identity rules, and we'd have to pick one (or rather, one for each relevant situation). The decision in this thread was to pick Object.is for all of them, as the most consistent. Other decisions are possible, but "use the existing identity rules" doesn't actually uniquely identify one possible such decision.

@bakkot If I do record === record i'd expect === all the way down. If i do record == record i'd expect == all the way down, and if i do Object.is(record, record) i'd expect Object.is all the way down.

@Zarel

const coord2 = #{x: coord.x * -4, y: coord.y - 3};
const isAtOrigin = coord2 === #{x: 0, y: 0};

isAtOrigin would be false,

This is reasonably compelling to me. I guess I would be OK with the -0 -> 0 normalization behavior that Set performs, since we have that precedent.

Zarel commented

My first choice is === all the way down.

My second is -0 → 0 normalization like Set. Incidentally, similar normalization is used by array.includes and is specced as "same-value-zero equality":

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Equality_comparisons_and_sameness#Same-value-zero_equality

(Which I think reflects the understanding of the entire rest of the spec that treating -0 and 0 as unequal is a huge footgun.)
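For reference, same-value-zero can be sketched in user JS roughly as follows (mirroring the spec's abstract operation):

// Like Object.is, except that +0 and -0 compare equal.
// Array.prototype.includes and Map/Set lookups use this algorithm.
function sameValueZero(x, y) {
  return x === y || (Number.isNaN(x) && Number.isNaN(y));
}
sameValueZero(0, -0);    // true
sameValueZero(NaN, NaN); // true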

@papb says "So to be honest I think the decision is basically irrelevant for us who know these details" but I don't think this is true – I know all the details, and I still would probably get tripped up by a #[0] !== #[-0] inequality. You would either have to be careful every time you multiplied/divided two numbers, or every time you put any number into a record/tuple, adding a lot of boilerplate code that should be unnecessary.

I see in the update slides that -0 and +0 are still being considered not equal and I want to reiterate how incorrect that is. When a negative number underflows to 0 it has to keep its sign or further values extrapolated from that number will have the incorrect sign (this is also why normalizing the value to +0 is not correct). Due to this property, IEEE 754 punts sign normalization to equality, which @Zarel helpfully demonstrated above:

As an example, what if someone uses a record/tuple as a coordinate?

const coord = #{x: 0, y: 3};

And then they decide to move it around a bit:

const coord2 = #{x: coord.x * -4, y: coord.y - 3};
const isAtOrigin = coord2 === #{x: 0, y: 0};

isAtOrigin would be false, but I think most programmers writing code like this would assume it would be true, and if they did, they'd encounter no problems other than this specific comparison.

As for NaN, it doesn't break any calculation you might be doing to make it equal to itself (although it is against IEEE 754 to do so, and some people will argue that preventing that equality can halt forward progress as NaN intends), but as I mentioned above, I think breaking programmers' expectations about recursive equality is far more harmful than the benefit of being able to compare NaN (I was unable to find a single person who thought that non-recursive equality was a good idea, and many were surprised that such a question would even need to be asked).

Just a random person here, but I would really really prefer that IEEE 754 numbers act like IEEE 754 numbers, and not what "make sense". NaN != NaN is a pain in the ass for everyone but it is there for good reasons. NaN is not a value, it's a catch-all for "can't do this". It's SUPPOSED to be a pain in the ass, because it's a signal that your code screwed up somewhere. NaN also is not a number, it's not really even intended to be a value, it's an error. If NaN == NaN, then you're saying 0/0 == inf/0 , which doesn't seem helpful at all. You might as well assert that two uninitialized values in C have to compare equally.

Second, your computer's hardware isn't going to like you trying to tell it that NaNs are equal, and there are different encodings of NaN, so it turns every floating point comparison into multiple ones.

Please don't randomly second-guess a standard "because it seems to make sense to me", especially when it's trivial to find out the reasons these things are the way they are. I'm all for trying to find a better approximation for real numbers than IEEE 754, for interesting values of "better", but when every computer built in the last 40 years has worked a particular way I'd like people to please think more than twice before saying "let's just randomly change the rules in this particular use case".

(new Set([+0])).has(-0) is true

I tested and this is correct. But I find it extremely surprising. But good! Where in the spec is this normalized?

Given this strange, surprising, and pleasant fact, I am leaning towards normalizing -0 to 0 with records and tuple and then adopting Object.is semantics on the result. Has most of the virtues of both Object.is and SameValueZero while avoiding most of their problems.

But first I want to understand how the spec already normalizes these for Sets and Maps. Thanks.

Btw, the Agoric distributed object system used to carefully preserve the difference between 0 and -0 in its serialization format. We defined our distributed equality semantics for passable values to bottom out in Object.is.

We changed this to let JSON always normalize -0 to 0. Our distributed equality semantics now bottom out in SameValueZero, which works well with that normalization.

The whole NaN !== NaN thing to me is a category error between thinking within the system of arithmetic that these values are about, vs thinking about the role these values play as distinguishable first class values in the programming language. In E there is a distinct arithmetic equality comparison operator that is a peer to <, <=, >=, and >. We call this "same magnitude as". NaN indeed is not same magnitude as NaN and -0 is same magnitude as 0. Note that the notion of magnitude that all these operators compare is about the role of these values as representative of arithmetic numbers. In JavaScript, we're stuck with == and === as the way you say "same magnitude as".

Object.is is about observable equivalence. It is about the role of these values in producing computation, and whether a difference produces observably different computation. Think about writing a functional memo function. Given a pure function of pure inputs, the memoization of that function should be observably identical to the original. This memoization has to compare current arguments against previous arguments. A pure function cannot give different results for a NaN now vs previously. A pure function can give different results for 0 and -0. The memo had better compare inputs on that basis.

If the memo is built naively on Sets and Maps, it will work correctly on NaN but memoize incorrectly on -0.
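A minimal sketch of that failure mode; memoize and the sample function below are illustrative only:

// A naive memoizer keyed on its argument via a Map, which compares keys with SameValueZero.
function memoize(fn) {
  const cache = new Map();
  return (x) => {
    if (!cache.has(x)) cache.set(x, fn(x));
    return cache.get(x);
  };
}
const inverse = memoize((x) => 1 / x);
inverse(NaN); // NaN, and later NaN calls correctly reuse the cached result
inverse(0);   // Infinity
inverse(-0);  // Infinity from the cache, but the unmemoized answer is -Infinity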

Some other unfortunate anomalies:

['a', NaN].includes(NaN); // true, good
['a', NaN].indexOf(NaN); // -1, crazy
(_ => {
  switch(NaN) { 
    case NaN: return 'x'; 
    default: return 'y'; 
  }
})(); // y, insane. Would anyone expect that?

SameValueZero is also surprising, but less so:

['a', -0].includes(0); // true
['a', -0, 0].indexOf(0); // 1
(_ => {
  switch(-0) { 
    case 0: return 'x';
    case -0: return 'z';
    default: return 'y'; 
  }
})(); // x

(new Set([+0])).has(-0) is true

I tested and this is correct. But I find it extremely surprising. But good! Where in the spec is this normalized?

In Set.prototype.add (which is also used by the Set constructor) there is an explicit normalization which turns -0 into 0. In Set.prototype.has and Set.prototype.delete the comparison operation against items in the underlying [[SetData]] slot is performed using SameValueZero.

Map does the same thing in its set, get, and delete methods.
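Concretely, the observable behavior described above:

const s = new Set([-0]);
s.has(-0);                 // true (SameValueZero lookup)
s.has(0);                  // true
Object.is([...s][0], -0);  // false: add() stored it as +0

const m = new Map([[-0, 'x']]);
m.get(0);                        // 'x'
Object.is([...m.keys()][0], -0); // false: set() normalized the key to +0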

Good, thanks.

I am in favor of always normalizing -0 to 0 in records and tuples. Such immutable containers would never contain a -0. We'd then compare using Object.is semantics. This has most of the benefits of both SameValueZero and of Object.is.

Did you all just skip my comment or something?

@devsnek Your comment mostly just says that you are in favor of recursive equality, meaning presumably that each of the four equality algorithms in JS would be extended so that invoking them on tuples would invoke them recursively on their contents. This would mean that, instead of there being three values for which Object.is is not ===, there would now be an infinite set of them. I think that's bad, as I've already said above. There's not much else to say.

@bakkot i mean the part about zeros... sets/maps are a weird category because they deduplicate their keys. most set/map impls in languages either don't specialize for 0 and use whichever one is inserted first (like c++) or provide no default hashing implementation for doubles (like rust) but js chose to normalize it to +0. I don't think you can really take any useful conclusion for "how to store a property" from that.

Aside from maps/sets, it has been pointed out multiple times that normalizing or doing weird equality things to -0 is mathematically incorrect with regard to how ieee754 works. This crusade against having a functioning implementation of numbers needs to stop.

it has been pointed out multiple times that normalizing or doing weird equality things to -0 is mathematically incorrect with regard to how ieee754 works

The point in question here is precisely that IEEE754's numbers do not work like "mathematical" numbers, so to say that something is "mathematically incorrect with regard to how ieee754 works" is... confusing.

It is safe to assume that people in this thread understand how IEEE754 works. Repeatedly pointing out that IEEE754 mandates a particular equality semantics is not particularly helpful. (It is also not all that relevant, because we are trying to decide how to compare structures which contain numbers rather than comparing numbers themselves.)

See @erights' comment above about "thinking within the system of arithmetic that these values are about, vs thinking about the role these values play as distinguishable first class values in the programming language".

This crusade against having a functioning implementation of numbers needs to stop.

We have a disagreement about what functionality is best. No one is crusading and no one is intentionally trying to make anything less functional.

This kind of rhetoric really isn't helpful. Please don't do this.

I don't think "best" has anything to do with this. You can either compare them correctly or incorrectly.

(_ => {
  switch(NaN) { 
    case NaN: return 'x'; 
    default: return 'y'; 
  }
})(); // y, insane. Would anyone expect that?

According to MDN, a switch statement compares with ===. NaN === NaN results in false. So, I'd expect that because it's consistent with the rules of the language. I'd expect #[NaN] === #[NaN] to be treated similarly, not to magically be true. It also seems that if you define that tuples are compared recursively with the same equality operator, #[NaN] === #[NaN] would result in it having the same truth table as NaN === NaN, and not === needing to contextually know whether or not it's comparing things inside tuples. Similarly, Object.is(NaN, NaN) evaluates to true; that's not a mathematical comparison though, that's object equivalence. Object.is([NaN], [NaN]) evaluates to false, as does Object.is([1], [1]).

I'd prefer something to be weird but consistent over "makes sense to me but only works in certain contexts".

Aside from consistency (which is, in general, a totally valid argument), does anyone have an application/use case where it would be useful to get this sort of nested treatment of NaN/-0 in ==/===? (EDIT: This was answered in #65 (comment). Also, @waldemarhorwat pointed out that something like this could be used to represent complex numbers.)

Separately, if you expect this recursive treatment for === in NaN/-0, do you expect == on Records and Tuples to do things like compare BigInts and Numbers as equal when they have the same mathematical value?

Personally, I think all of these expectations around recursiveness face the category error @erights pointed out: because we don't have a syntactically convenient "are these exactly the same thing" vs "are these IEEE754/mathematically the same" operator difference, people are going to use === when, for Records and Tuples, the real goal is to find whether they are exactly the same thing.

I claim that "are these the same thing" is actually what people are trying to ask when they compare Records and Tuples with ===. I'd say, we should go with the answer to the question they're intending to ask, and let programmers manually recurse and apply the right IEEE754/mathematical comparison operator if that's what they really want.

If someone had a use case for the recursive "are these IEEE754/mathematically the same" semantics, that might convince me otherwise on this question. My current intuition is, mathematical comparisons in these contexts is just a footgun, not the intended semantics at usage sites.

I'm not sure how you'd all weigh this objective, but one goal of this proposal is to permit implementations to optimize ===, taking advantage of how the structures will never change. But if === is an operation on an equivalence class of objects, such an optimization through techniques like "hash consing" (reusing the same structure if the contents are the same, so === is cheap if they are the same) become impractical/impossible. (Aside: Not all engines will do this optimization, though, and it's likely that it sometimes won't kick in even on engines that do perform it. Ultimately, the situation would be similar to strings, where only some engines optimize with rope data structures internally, and they differ in their heuristics.)
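As a rough illustration of the hash-consing idea (this is engine-internal in practice; everything below, including the interning key, is only a conceptual sketch):

// If every tuple were interned at construction, === could afterwards be a plain pointer comparison.
const interned = new Map(); // structural key -> canonical representative
function intern(values) {
  const key = values.join('\u0000'); // stand-in for a real structural hash
  if (!interned.has(key)) interned.set(key, Object.freeze([...values]));
  return interned.get(key);
}
intern([1, 2]) === intern([1, 2]); // true: same canonical object, so identity comparison suffices
// But if === had to report #[NaN] !== #[NaN], a pointer check against the single
// canonical representative could no longer answer === on its own.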

(I'd be OK, I guess, with normalizing -0 to 0--it seems like the least harmful option in this thread--but I still don't really understand the motivation, and it seems a little unfortunate to sort of randomly lose the ability to represent -0 within a Record/Tuple. So I agree with @bakkot 's comments at the start of this thread)

Aside from consistency (which is, in general, a totally valid argument), does anyone have an application/use case where it would be useful to get this sort of nested treatment of NaN/-0 in ==/===?

An example was posted above, twice in fact.

Separately, if you expect this recursive treatment for === in NaN/-0, do you expect == on Records and Tuples to do things like compare BigInts and Numbers as equal when they have the same mathematical value?

I expect == to do what == does, === to do what === does, and Object.is to do what Object.is does. I'm actually not aware of any languages with tuples where equality is defined in terms of something other than recursively applying the operator to the children. They either don't come with their own equality (a op b doesn't compile, and you have to define your own op implementation for it) or they are defined by their children (a op b is true if the children are in the same order and op is true for them).

Personally, I think all of these expectations around recursiveness face the category error @erights pointed out: because we don't have a syntactically convenient "are these exactly the same thing" vs "are these IEEE754/mathematically the same" operator difference, people are going to use === when, for Records and Tuples, the real goal is to find whether they are exactly the same thing.
I claim that "are these the same thing" is actually what people are trying to ask when they compare Records and Tuples with ===. I'd say, we should go with the answer to the question they're intending to ask, and let programmers manually recurse and apply the right IEEE754/mathematical comparison operator if that's what they really want.

As has been noted above, people don't often want that. Even if your code has signed zeros at some point, you usually don't even know about them or worry about them, because == and === handle them correctly. If you're really looking for object equivalence, you already know about Object.is, and you can just use it. I would even argue that, since most people aren't experts in IEEE754, they will accidentally feel the need to treat -0 and 0 differently when they actually don't want to do so. There will be special cases, like memoization, but multiplying numbers is more common than writing memoization tools.

If someone had a use case for the recursive "are these IEEE754/mathematically the same" semantics, that might convince me otherwise on this question. My current intuition is, mathematical comparisons in these contexts is just a footgun, not the intended semantics at usage sites.

Like the example shown above twice, the intuition of most programmers will likely disagree with your intuition here, especially those coming from other languages with these kind of structures. I think the code that was posted not working would be a major major major footgun.

I'm not sure how you'd all weigh this objective, but one goal of this proposal is to permit implementations to optimize ===, taking advantage of how the structures will never change. But if === is an operation on an equivalence class of objects, such an optimization through techniques like "hash consing" (reusing the same structure if the contents are the same, so === is cheap if they are the same) become impractical/impossible. (Aside: Not all engines will do this optimization, though, and it's likely that it sometimes won't kick in even on engines that do perform it. Ultimately, the situation would be similar to strings, where only some engines optimize with rope data structures internally, and they differ in their heuristics.)

Yes, -0 and NaN would make this more complex, but I suspect not by much. An implementation will most likely have a slow path when the two structures are not reference equal where it can apply other checks. Note that this doesn't have to literally be recursing through the structures, you can take advantage of known shapes and whatnot to skip that. There are a lot of very fast languages with tuple equality we can take notes from.

(I'd be OK, I guess, with normalizing -0 to 0--it seems like the least harmful option in this thread--but I still don't really understand the motivation, and it seems a little unfortunate to sort of randomly lose the ability to represent -0 within a Record/Tuple. So I agree with @bakkot 's comments at the start of this thread)

As I said above, normalizing -0 to 0 is incorrect because it can cause operations involving very small or very large numbers to end up with the wrong sign.

An example was posted above, twice in fact.

The example at #65 (comment)

const coord = #{x: 0, y: 3};

const coord2 = #{x: coord.x * -4, y: coord.y - 3};
const isAtOrigin = coord2 === #{x: 0, y: 0};

works fine if we always normalize -0 to 0 and then use Object.is semantics. Is this the concrete example you have in mind?

@erights yes, but as I keep saying you can't normalize -0 to 0...

Yes, I understand we disagree on -0 normalization.

I appreciated your concrete example. If you could illustrate your other points with concrete examples that seem representative of real issues, that would be awesome. Thanks.

Zarel commented

Here's an example of -0 normalization causing problems:

const coord = #{x: 1, y: -3};
const coord2 = #{x: coord.x / 1e350, y: coord.y / 1e350};

const isBelowOrigin = coord2.y < 0 || Object.is(coord2.y, -0);

Apologies for missing the previous example. I've updated my earlier post, and also mentioned that @waldemarhorwat had previously raised the possibility of complex numbers. I share @erights ' interest in further examples.

Now that we know it's possible to construct a case where -0/NaN semantics may be useful, I think the next step is: how do we prioritize this vs other invariants/goals that have been posited?

I can think of other things that would be useful for these point/complex number cases--for example, it'd be great if < could compare their magnitude, right? It would also be great if I could make my own string-like rope type such that I could use === for comparison, and it'd compare the string contents, not the details of the structure, which would allow libraries to have further control over rope heuristics and performance (cc @sebmarkbage who requested this kind of thing).

But at some point we draw a line and say, "even though I can think of a use case, this can be accomplished other ways, and doesn't need to be part of the built-in semantics of JS". The existence of some use cases doesn't make the other committee members' goals go away; they remain in tension, and we have the opportunity of drawing a tradeoff one way or another. In particular, the idea to apply === and == recursively is in tension with the goal of not making them any harder to understand than they are right now; there are already a lot of cases.

Some possible solutions for the scenarios above:

  • As long as we don't normalize, you should be able to (very easily) write a comparison function in JS (see the sketch after this list). This would parallel how you have to write comparison functions for everything else in JS, and you'd still get to use === when asking whether your user-declared type is actually the same value, with the one-off exception of Numbers.
  • In a future value types proposal (if we figure out how to do one), we could make a declarative way to note that particular fields are compared IEEE754-wise. Points and complex numbers would fit into such a scheme.
  • (Operator overloading could implement < and == maybe, but this is a tangent)
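A minimal sketch of the comparison function mentioned in the first bullet; isRecordOrTuple is a hypothetical detection helper, and this assumes Object.entries works on Records and Tuples the way it does on frozen objects and arrays:

// Element-wise "=== semantics" written in user land: -0 equals 0 and NaN !== NaN
// even when nested, while the built-in === keeps Object.is-like semantics.
function elementwiseEquals(a, b) {
  if (isRecordOrTuple(a) && isRecordOrTuple(b)) {
    const aEntries = Object.entries(a);
    const bEntries = Object.entries(b);
    return aEntries.length === bEntries.length &&
      aEntries.every(([key, value], i) =>
        bEntries[i][0] === key && elementwiseEquals(value, bEntries[i][1]));
  }
  return a === b; // numbers fall through to IEEE 754 comparison
}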

Personally, I do still think this comes down to the category difference @erights raised above. That === on Numbers compares them in their domain through mathematical rules doesn't rule out === on Records comparing to see if they're the same Record (just like === on Objects compares them to see if they're the same Object).

Unlike APL and its descendants (which I think have a very cool programming model), we don't have much of a pattern in JavaScript of operators transparently applying recursively into data structures' contents; it's very unclear to me why this is the place we need to start.

@littledan isn't one of the main points of these structures that they don't have their own identity, but rather have an identity based on what they contain?

@devsnek Short answer: Yes, that's why === on Records and Tuples should check whether they have the same identity, i.e., that they contain exactly the same things.

Long answer: There's two ways to think about primitives in JavaScript, that are isomorphic to each other, in that they each describe the actual language:

  • One is that they are these things where you could have multiple different primitive values which equal each other, and the language is so thorough about it that you can never see the difference between two equal primitives--their identity doesn't leak. One part of preventing this leak is overloading === for each primitive such that they're compared by contents. (And, as a special case which often surprises JS programmers, Numbers, on ===, also take into account IEEE754 comparison logic, so that you can access this logic without also plunging into the horrors of ==. Fortunately, this special case doesn't leak the identity of the primitives, so it's not the end of the world.)
  • The other way to see primitives is that if they have the same contents, then they're actually the same thing. As if they're always hash-consed/interned. If you compare two strings with ===, you're checking, "is this the same string?", in the same sense as comparing two objects with === checks if they're the same object. In a real implementation for === on strings, you're probably checking by the contents some of the time, since sometimes they'll be represented in different places in memory, but in the JS of this mental model, it's simply, "Are these Numbers falling into the -0/NaN special cases? OK, otherwise, are these the same value?" The JS spec is largely written in this mental model, in how it uses "is" when comparing values, or the name of the algorithm "SameValue".

It's this second intuition, which I find more intuitive, that makes me think, "the identity of a Record or Tuple is the whole value, just like the identity of a string is the contents of a string", and that === on the domain of Records should compare by identity, as it does for all non-Number types.

This mental model doesn't rule out the creation of new primitive types in JS which get added to ==='s exception list. But it gives a different polarity for the "default" that Records and Tuples would fall into.

@littledan right so... given different people have different intuitions, and that we have both === and Object.is in the language, why not enable both? It also means you don't have to explain it like "they have the identity of their children except for certain double bit patterns".

right so... given different people have different intuitions, and that we have both === and Object.is in the language, why not enable both?

I hope we can settle on one general mental model for how this feature fits into the language; smooshing together multiple conflicting ones doesn't feel like it'll lead to good outcomes. So, I don't think different intuitions is a good reason to enable both things. Different use cases can be. That's why we've been discussing use cases, and how important/prevalent/necessary they are.

=== is a lot more terse and well-known than Object.is, so I think that we're likely choosing the semantics of the "default" comparison operator when we decide on === semantics, unless we work hard to educate JS programmers not to do === on Records and Tuples. I don't think it's a level/neutral comparison.

It also means you don't have to explain it like "they have the identity of their children except for certain double bit patterns".

I'm not sure what you mean; is this a comment about how the normalization of -0 to 0 would be bad? I guess that wouldn't affect the identity really, it'd be more a property of how construction of records and tuples affects certain double bit patterns. I guess I agree this would be a bit weird, but I don't think it'd complicate what identity is.

In all these alternatives, Records and Tuples would fully have the identity of their children in any case; the question is whether === should compare Records and Tuples by identity or recursively do === on the contents.

unless we work hard to educate JS programmers not to do === on Records and Tuples

Is the assumption here that complex memoization code will be way more common than wanting to put a couple of numbers into a record?

papb commented

@littledan You said:

the identity of a Record or Tuple is the whole value, just like the identity of a string is the contents of a string

Ok, but what is the "whole value"? Are #{ x: 0 } and #{ x: -0 } the same "whole value"? I don't think the answer to this question is trivial/immediate/intuitive. I could see reasons for answering 'yes' and for answering 'no'.

Therefore I think the discussion still persists, only shifting now to the concept of "whole value".

By the way, the only way to define an "identity" for the "whole value" that I can think of is via recursive equality. Is there any other way?

You also said earlier:

we don't have much of a pattern in JavaScript of operators transparently applying recursively into data structures' contents; it's very unclear to me why this is the place we need to start.

However, we also don't have much of a pattern for doing anything at all with data structures other than comparing them by reference (i.e. by their memory locations, so to speak). So, I think it makes sense for this to be "the place we need to start", since Records and Tuples are already bending what we know of === anyway. In other words, regardless of what's decided about #{ x: 0 } === #{ x: -0 }, the fact that #{ x: 1 } === #{ x: 1 } is true is already bending the usual concept of === with a data structure. So not only is this the place we need to start, but we have already started.

papb commented

Therefore, since from what I can see the concept of equality for a Record and Tuple cannot be defined except by recursion, and we are recursing anyway, the least surprising choice would be to preserve the already existing behavior. To answer the first post:

#[+0] == #[-0]; // the same as `+0 == -0`, i.e. `true`

#[+0] === #[-0]; // the same as `+0 === -0`, i.e. `true`

Object.is(#[+0], #[-0]); // the same as `Object.is(+0, -0)`, i.e. `false`

#[NaN] == #[NaN]; // the same as `NaN == NaN`, i.e. `false`

#[NaN] === #[NaN]; // the same as `NaN === NaN`, i.e. `false`

Object.is(#[NaN], #[NaN]); // the same as `Object.is(NaN, NaN)`, i.e. `true`

This would enable @Zarel's example of using them as coordinates without surprises:

const coord = #{x: 0, y: 3};
const coord2 = #{x: coord.x * -4, y: coord.y - 3};
const isAtOrigin = coord2 === #{x: 0, y: 0}; // true!

I want to be unambiguous about this concept of "the same value": If two things are reliably distinguishable at all (e.g., through Object.is), I don't see them as the same "whole value". Maybe at some high level, but not in the sense I'm trying to get at. There are clearly multiple ways to define the semantics here; people in this thread have expressed that they'd find either way surprising, so we have to decide who we're OK with surprising.

I agree with @papb's comment.

Having a case where a === b is true (Object.is equality), but a[0] !== b[0] (strict equality) seems wrong to me. And specifically, making === sometimes do Object.is equality (for records) and sometimes do strict equality (for everything else) complicates the operator considerably.

And specifically, making === sometimes do Object.is equality (for records) and sometimes do strict equality (for everything else) complicates the operator considerably.

I don't know what this means. We are deciding what strict equality means for records. === will always do strict equality by definition.

In any case, I think adding an infinite set of values for which === and Object.is disagree, where currently there are exactly three such values, would complicate the operator considerably.

I think adding an infinite set of values for which === and Object.is disagree, where currently there are exactly three such values, would complicate the operator considerably.

I think this depends on your point of view. From my perspective, it's still only three values, and the operator doesn't change.

From my perspective, it's still only three values

... What? Currently x !== x robustly implies that Object.is(x, NaN) (indeed that used to be a common way to write that check). And x === y && !Object.is(x, y) robustly implies x and y are 0 and -0 (in some order). After this change, neither of those would be true.
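Spelled out, under the recursive semantics being argued for by others in this thread (not the champions' suggestion):

// Today: x !== x only when x is NaN; x === y && !Object.is(x, y) only when x and y are +0 and -0.
// With recursive ===, infinitely many other values would satisfy these, e.g.:
const a = #[NaN];
a !== a;                     // would be true, yet a is a tuple, not NaN
const b = #[0], c = #[-0];
b === c && !Object.is(b, c); // would be true, yet neither b nor c is a zero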

I am sincerely confused by this claim.

@bakkot there are still only three values that cause those checks to return true instead of false, they just might be inside the tuples you're comparing.

If it helps, a lot of people don't think of these structures as values unto themselves, but rather "groups of values"

they just might be inside the tuples you're comparing

So... there are more than three values, because those checks return true instead of false when x or y are values other than 0, -0, or NaN. Right?

If it helps, a lot of people don't think of these structures as values unto themselves, but rather "groups of values"

The whole point is that we are reifying "groups of values" into values. A record is a value; that is the whole point of it.

@bakkot sorry to clarify: if you think of a proxy, it is its own value, but it derives its existence from something else. when you're talking about a proxy you're normally talking about its handlers or its target, not the proxy instance itself. in a similar way tuples, at least in other programming languages, are less about the tuple instance itself and more about what it contains. i think it's unintuitive for a lot of people to even think of a tuple as having behaviour.

papb commented

I want to be unambiguous about this concept of "the same value": If two things are reliably distinguishable at all (e.g., through Object.is), I don't see them as the same "whole value".

@littledan I am still confused 😅 we haven't decided how Object.is will behave, so this seems like circular reasoning to me.

papb commented

If it helps, a lot of people don't think of these structures as values unto themselves, but rather "groups of values"

@devsnek I think this thought is dangerous, and can lead to lots of confusion. A value should be nothing more nothing less than something that can be put in a variable. Like an object is a value, a record must be a value too. That thought (apparently had by a lot of people) is unhelpful, I think.

papb commented

In any case, I think adding an infinite set of values for which === and Object.is disagree, where currently there are exactly three such values, would complicate the operator considerably.

@bakkot What do you mean by "complicate the operator considerably"?

As you pointed out, with my suggestion, indeed x !== x would no longer imply Object.is(x, NaN). Instead, it would imply that x is either NaN or a record/tuple containing some (possibly deeply-nested) NaN.

At first, this seemed a bad idea to me, but giving it a chance and thinking more carefully, I don't think it is bad. In fact I now think it is the correct behavior. Just like IEEE 754 had good reasons to define NaN !== NaN, if you consider the example of a record representing a coordinate, the same good reasons are applicable to make a case for #{ x: 10, y: NaN } !== #{ x: 10, y: NaN }.

@papb There's only one plausible definition of Object.is for Records and Tuples--it recursively compares them for Object.is. That's not the topic of this issue.

papb commented

I would also like to rephrase another great point raised by @jridgewell that also favors my reasoning:

If we allow #[NaN] === #[NaN] to be true, then it will be possible to have two variables x and y which satisfy x === y but don't satisfy x[0] === y[0]. To me, this alone is so crazy that I don't even need my own previous arguments 😅 I am convinced by this one alone.

papb commented

@papb There's only one plausible definition of Object.is for Records and Tuples--it recursively compares them for Object.is. That's not the topic of this issue.

@littledan Ah, great!! Sorry about that then. I'm happy to hear that. I thought it was also the topic of this issue because Object.is(#[+0], #[-0]) and Object.is(#[NaN], #[NaN]) are mentioned in the first post.

Object.is(#[+0], #[-0]) is open because it's plausible that we could normalize -0 to 0, as we do for Set and Map. Object.is(#[NaN], #[NaN]) is just there for completeness.

Ah, right, but I guess that's not the definition of Object.is but rather #[-0].

@papb yeah, saying it is not like a value was imprecise and hopefully my clarification helped.

papb commented

@bakkot @littledan Ah, ok. About #[-0], my opinion is that it shouldn't be normalized, based on the issue raised by @Zarel in this comment. Regarding precedent from Map and Set: I think Map and Set are distinct enough from Records and Tuples to make Zarel's argument have more weight.

@papb

const x = { get '0'() { return Math.random(); } };
const y = x;
console.log(x === y, x[0] === y[0])

that said, non-idempotent getters themselves are pretty strange :-) but it is already possible.

that said, non-idempotent getters themselves are pretty strange :-) but it is already possible.

For values of non-configurable, non-writable data descriptors, this will be unique.

of course, there's also:

const x = Object.freeze({ '0': NaN });
const y = x;
console.log(x === y, x[0] === y[0])

:-p

@papb As @ljharb points out,

it will be possible to have two variables x and y which satisfy x === y but don't satisfy x[0] === y[0].

is already true - just let x be an array whose first element is NaN (and y be x). That this would still be true when x is a tuple rather than an array doesn't (I would think) add any new surprises to the language, whereas adding an infinite set of new values for which === and Object.is differed would, in my estimation.

Maybe we can add Object.strictEquals so we can still compare the structures correctly even if === is broken.

@devsnek Leaving aside how we should define ===, if this were standard library functionality, I don't really understand why element-wise === is any more common/important of an operation than element-wise < or + (which certainly have their own use cases).

@littledan I'm confused, are you saying we shouldn't let === work on these structures at all?

@devsnek Sorry for being unclear; I'm still pushing for === having Object.is semantics on Records and Tuples. I mean, I don't see why element-wise IEEE754-style equality comparison on Records and Tuples is more common/important than element-wise < or +.

@littledan because the === operator already does that kind of comparison on numbers. if we had an operator for Object.is semantics (maybe adding that could solve this dispute) it would be fine for it to use Object.is semantics on the elements.

As for all the other various operators, maybe they're worth discussing, but this issue is about what === should do so I don't see much point in debating the other operators.

A couple of data points:

In Java's value ("inline") types proposal (Project Valhalla), two value types containing NaN are considered to be == equal (as described here, search for "equality" or "the legacy behavior of NaN"). And in their Records proposal, I believe .equals will use Double.compare for double fields, which means two records with fields holding NaN are considered to be .equals.

In Python the built-in data structures are allowed to assume their elements are reflexive, which means

>>> n = float("nan")
>>> (0, n) == (0, n)
True

(though (0, float("nan")) == (0, float("nan")) may be False).

why does factoring nan into a variable change how == works

why does factoring nan into a variable change how == works

It doesn't; n == n is False in my above example.

It is specifically when comparing tuples containing NaN that you can get True, exactly as is being discussed here.

you said both that n = float('nan'); (0, n) == (0, n) is True and that (0, float('nan')) == (0, float('nan')) can be False.

float('nan') (typically) gives you a "fresh" NaN, such that float("nan") is float("nan") is (typically) False. But n = float('nan'); n is n is always True.

got it. it seems that python also treats -0.0 and 0.0 as equal, but can tell them apart using the is operator.

it is against IEEE 754 to do so, and some people will argue that preventing that equality can halt forward progress as NaN intends

If NaN === NaN produced a NaB (not a boolean) or threw an error, or even went into an infinite loop, that would prevent progress. Instead it returns false. If NaN conceptually means "we don't/can't know what number this is supposed to be" then returning false is just as bad as returning true. In fact it is still worse because even for such an unknown, we'd know the reflexive case x === x would be true even if we don't/can't know what x is.

Obviously we're not going to fix IEEE. But I do not accept that breaking reflexive equality was a good idea, even just within the domain of arithmetic.

@erights yeah i don't care about nan as much, as it doesn't really have bearing on the correctness of successful numeric operations. my argument there was more about staying consistent with the operator being used.

At #65 (comment) @Zarel offers the example:

const coord = #{x: 1, y: -3};
const coord2 = #{x: coord.x / 1e350, y: coord.y / 1e350};

const isBelowOrigin = coord2.y < 0 || Object.is(coord2.y, -0);

I don't understand what this is supposed to be an example of. It resorts to Object.is to reveal whether coord2.y is -0. But Object.is is not an arithmetic operator. It does not exist, for example, in IEEE. Is there an arithmetic, numerical example of the utility of -0? I'm sure there must be. But the only within-IEEE observable consequence I know of the difference is that, for example, 1/0 === Infinity and 1/-0 === -Infinity. If you've already fallen off the precision limits so far as to reach an infinity, especially if it is because you divided by some zero, you've probably already lost anyway and need to rewrite your algorithm.

What within-IEEE-arithmetic examples are there of useful calculations without infinities that go awry if a -0 were normalized to a 0?

What within-IEEE-arithmetic examples are there of useful calculations without infinities that go awry if a -0 were normalized to a 0?

I mean any -0. The example need not have anything to do with records and tuples. I am trying to understand what it is about numeric computations that motivates the two zeros, and what is lost if the distinction were collapsed.

Again, obviously, we are not going to fix IEEE. But I don't understand this and would like to before continuing the records and tuples debate.

Zarel commented

Is there an arithmetic, numerical example of the utility of -0?

As far as I'm aware, -0 and 0 only differ in two ways:

  1. Object.is can tell them apart.
  2. 1/0 is Infinity, while 1/-0 is -Infinity (and so on for other numerators).

The latter is arithmetic and numerical. You might think about it in the context of:

const otherFrame = #{
  time: -1 / 1e350,
  distance: -1,
};
const movedForwards = otherFrame.distance / otherFrame.time > 0;

movedForwards should be true, but if you normalized -0 to 0, movedForwards would be false.

I am trying to understand what it is about numeric computations that motivates the two zeros, and what is lost if the distinction were collapsed.

Very small negative numbers underflow to -0 because if they underflowed to +0 they would have the wrong sign when you tried to extrapolate further calculations or decisions from them. Additionally, multiplying 0 by a negative number results in -0. I'm not sure why that is, but it is common enough (see above examples) that it's definitely a footgun if people have to explicitly watch out for it.
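Both of those paths are easy to hit:

0 * -1;                  // -0: multiplying zero by a negative
-1e-200 * 1e-200;        // -0: underflow keeps the sign
1 / (-1e-200 * 1e-200);  // -Infinity: the sign still matters downstream
Object.is(0 * -1, -0);   // true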

@devsnek do you agree with @Zarel 's assessment that it can only make an arithmetic difference if things explode to infinity? That, infinities aside, there cannot be any arithmetic difference? If not, could you show an arithmetic example not involving infinities?

@erights yes that sounds right. it's worth noting though that you might not be explicitly using infinity or dividing by zero, but something could have overflowed.

I gotta say, if the only cost on the arithmetic side is on the sign of infinities, I'm leaning ever farther towards normalizing -0 to 0 in records and tuples.

How important are the signs of infinities to actual useful numeric algorithms in practice?

I can't imagine it comes up that often but I think js is enough of a general purpose language that we can assume someone will run into that.

I'm not yet sold on normalizing -0 to 0. I would prefer if we could let all primitives be represented in Records and Tuples unscathed. Otherwise I worry this becomes yet another case for people to think about and consider whether it affects them. Users can always choose to normalize -0 to 0 themselves when constructing a Record or Tuple if that is what they want.

kfdf commented

As a novice javascripter who has stumbled upon this thread while on the lookout for "how do they do this thing in javascript?" my initial thought was that any case where #[x] === #[x] is inconsistent with x === x would be a massive gotcha. But thinking about it, you can look at it from another perspective: it is not that Records and Tuples introduce more inconsistencies to the language, it's just that -0 and NaN have another place to cause mischief, like they do wherever they show up... equal here, not equal there, and even flipping signs at times...
One problem I see, if === works as Object.is, is that even if you don't do any "fancy" calculations (and I think the absolute majority of people don't), -0 can still easily show up and break things. The principle of least surprise for the absolute majority suggests normalizing -0 to 0. And those few who do expect things like -Infinity to show up in their code have to know what they are doing anyway.

Why should -0 be normalized to 0 when placed in Records and Tuples, but not elsewhere in the language? I can understand how IEEE754 is confusing and not the best choice for JS's long-time only numeric type (that's part of why I worked on BigInt and now Decimal), but I don't understand why "putting a value in a Record or Tuple" is the place we should intervene and "fix" things, any more than we could've decided something like, if you assign a Number to a let or const-bound variable, then -0 is normalized to 0.

kfdf commented

It was "fixed" in Maps and Sets. Everywhere else it "just works" thanks to how === works. So it's either recursive === or, if it precludes some important optimizations, normalizing would be the second best choice.
That would be consistent with the expectations of a layman coder, otherwise, it is possible to get something like this:

function flipIfNegative(p) {
    return #{ 
        row: p.row > 0 ? p.row : -p.row, 
        col: p.col > 0 ? p.col : -p.col 
    };
}
flipIfNegative(#{ row: 5, col: 0 }) === #{ row: 5, col: 0 } // false, what the...?

The code is sloppy, but doesn't deserve to become bugged. If -0 appeared in "exotic" scenarios only, it wouldn't be much of a problem, but it can appear even if you treat numbers as integers.

papb commented

I don't think Map and Set deserve to be considered precedent for any decision here. The keys of Maps and elements of Sets have immense usage differences from a direct value in a structure. If you have foo = #{ x: -0 }, then you can directly ask yourself about foo.x. It should be -0. That's what it is!! I really don't like this normalizing idea. In Map and Set it is completely different, one does not simply access one key / element directly. They all become part of a larger abstract representation. It's very different.

papb commented

Thank you @Zarel for creating yet another example of how normalizing is bad!! @icefoxen also did a good job explaining/defending some of these reasons.

I would like to reiterate: IEEE754 has good reasons to do what they did. And even if you happen to disagree with that, """"fixing"""" it in only a specific part of the language will create a huge mess.

Imagine people refactoring all their objects into Records. If -0 is normalized to 0, then they won't, because they just can't do it faithfully.

papb commented

The code is sloppy, but doesn't deserve to become bugged.

@kfdf I disagree. Although this code not being bugged would be good on its own, that would go against many other issues I and others have raised. And since one piece of sloppy code is much easier to fix than a surprising language behavior, I believe your example code being bugged is a completely acceptable "price to pay".

papb commented

@erights Please take a look at this question on Software Engineering Stack Exchange: https://softwareengineering.stackexchange.com/questions/280648/why-is-negative-zero-important (in particular, the accepted answer)

kfdf commented

I don't think Map and Set deserve to be considered precedent for any decision here

It's not just them; what would be the most straightforward way to imitate Records and Tuples right now? String concatenation, or even JSON.stringify. But -0 doesn't survive either: both Object.is(0, Number(String(-0))) and Object.is(0, JSON.parse(JSON.stringify(-0))) are true.
So -0 normalization is nothing unusual, just perhaps not very explicit.

I gotta say, if the only cost on the arithmetic side is on the sign of infinities, I'm leaning ever farther towards normalizing -0 to 0 in records and tuples.

@erights I don't think I'm in favor of normalizing -0 to 0. If the choice is between "Record and Tuple can store all primitives except for -0" and "Records and Tuples containing -0 and 0 are not equal", I would choose the latter.

papb commented

It's not just them; what would be the most straightforward way to imitate Records and Tuples right now? String concatenation, or even JSON.stringify. But -0 doesn't survive either: both Object.is(0, Number(String(-0))) and Object.is(0, JSON.parse(JSON.stringify(-0))) are true.
So -0 normalization is nothing unusual, just perhaps not very explicit.

@kfdf I do not follow your reasoning.

[...] what would be the most straightforward way to imitate Records and Tuples right now?

Objects and arrays?

papb commented

By the way, my opinion so far has been built on the grounds of what seems less surprising, more consistent, and more useful. However, I have not considered the optimization aspect that is mentioned in the readme overview:

Additionally, a core goal of this proposal is to give JavaScript engines the capability to implement the same kinds of optimizations for this feature as libraries for functional data structures.

Does the decision on equality semantics for -0 and NaN impact this aspect of the proposal? If yes, how? Where can I learn more about what optimization aspects could be affected?