tc39/proposal-decimal

toString() and normalization

littledan opened this issue Β· 18 comments

EDIT: I seem to have misunderstood Waldemar's proposal; see #12 (comment) for clarification.

In this post, @waldemarhorwat proposed that BigDecimal.prototype.toString() (and presumably ToString applied to BigDecimal) should cut off all trailing zeroes, and not distinguish between elements of the same "cohort" (that is, decimals that have the same mathematical value, but different numbers of significant figures/trailing 0's).

One piece of motivation for Waldmar's argument is that it could be seen as surprising if arr[5m] was not the same as arr[5.000m]. But personally, I find it a little surprising if toString always discards the trailing zeros that we'd otherwise work hard to preserve. It may be semantically wrong for end users if toString() is used on BigDecimal and it cuts off trailing zeros unexpectedly.

I see a few options that we might take to square this circle:

  • Just tough it out and teach people to use Intl.NumberFormat when they want trailing zeroes represented (you should probably be doing something locale-specific when displaying numbers to people anyway).
  • Require that the option of whether or not to display trailing zeroes is provided to the toString algorithm, and make the ToString algorithm simply throw on BigDecimal, either through
    • Having two separate methods, BigDecimal.prototype.toStringNormalized() and BigDecimal.prototype.toStringWithTrailingZeroes(), and not have a toString() method.
    • Make the second parameter an options bag that's required to have a showTrailingZeroes option, and if it's not provided, throw.
  • Tough it out the other way and just include trailing zeroes in toString(), while discouraging developers from using BigDecimal as an array index.

Thoughts?

Maybe add an argument to toString much like Number's toString?

1.0n..toString(4) gives us 4 decimal places. 1.0n..toString(0) returns no decimal places.

For Number literals 10..toString need because we have shorthands .1 or 10.. For BigInt literal and for BigDecimal we don't need this because we have "m or n prefix escaping" - 10m.toString() or 10.0m.toString().

I’m confused; in this proposal, 5m and 5.00m are not the same identical number?

As I understand 1.0m != 1.000m because count of trailing zeros indicate BigDecimal's precision. The same as 1.0_f32 can't direct compare with 1.0_f64 without explicit conversions in some languages

Different types i understand; precision to me doesn’t seem like it should be part of identity.

@ljharb This is an interesting point; maybe you could file another issue if you want to pursue this further. This issue is about working out the semantics for toString under the assumption that BigDecimal would support trailing zeroes. As the README points out, trailing zeroes are a feature of many Decimal types, and they are important for many applications. See background in http://speleotrove.com/decimal/decifaq1.html#tzeros .

I'll try to find time to file another issue for that; as for this one, if 5.0d !== 5.00d, then I'd expect String(5.0d) !== String(5.0.0d).

It's always possible to state more invariants. I'd like to hear your rationale.

Java's BigDecimal has special method for comparison

Let's continue the comparison discussion in #11. As mentioned in that issue, I'd like to use @waldemarhorwat's proposal as a starting point for both toString() and equality semantics.

My rationale is that the string representation of two different things should be different (if the toString output is going to be useful at all). If 5.0d and 5.00d are the same (===) then they should have the same string output; if different, different.

Waldemar's proposal was that the two BigDecimals 5.00m and 5.0m would be different values but be ===, like 0.0 and -0.0. I think that if two things are SameValue, they should be indistinguishable in their behavior.

Waldemar's proposal was that the two BigDecimals 5.00m and 5.0m would be different values but be ===, like 0.0 and -0.0.

No, my proposal was not that that example (or, for even more fun, 6m-6m and 16m-16m) be different values. I want them to be the same. It's clear that they must be ===. They should be indistinguishable as well.

@waldemarhorwat Thanks for correcting my misunderstanding of those past email threads. The current semantics in the README is that they would be the same, indistinguishable, normalized.

For background, if you were to discriminate IEEE decimal cohort members such as 5.00m and 5.0m, then you'd also need to discriminate cohort members such as 100m and 2000m/20m. The first one produces 100m. The second one produces an integer equal to 100 but which can't even be written down as an integer if you discriminate cohort members. That way lies madness.

sffc commented

This issue hasn't been updated in almost 5 years but I wanted to weigh in here.

My understanding is that the current state of the proposal uses a "hybrid" normalization strategy where the data model retains Decimal128 but comparison and toString use normalized values by default. At the same time, Intl.NumberFormat retains the trailing zeros for reasons I've discussed on the record at multiple TC39 meetings (and am willing to write up again here upon request).

This introduces a potential inconsistency: toString and toLocaleString may print the values at different precisions.

Do we want to make toLocaleString more consistent with toString or more consistent with Intl.NF.p.format?

Note that Temporal is considering introducing some cases where toLocaleString passes some arguments (such as the calendar) into Intl.DTF that otherwise aren't available, so I think it's not too bad if Decimal chose to do something similar.

CC @jessealama

(Sorry this got dropped.)

IMO a disagreement between toString and toLocaleString is fine. I would hope that, over time, it would become common knowledge for decimal that those two do different things, for good reasons. Or, in other words, concerning the question

Do we want to make toLocaleString more consistent with toString or more consistent with Intl.NF.p.format?

I'd lean toward being more consistent with Intl.NF.p.format.

It seems to me the decimal integration is a bit simpler than the Temporal integration in that it should be possible to pass the full underlying decimal data to Intl.NF.p.format. Or perhaps I'm missing something?

sffc commented

I can buy into the comparison operations working on normalized values, but I think it's worth keeping open the discussion of whether String(decimal) and decimal.toString() and especially JSON.stringify(decimal) should retain the whole data model. I would like this to still be an open discussion in Stage 2.