tc39/proposal-math-sum

sumPrecise name and "floating point" language may be misleading

Opened this issue ยท 11 comments

Hi @bakkot, thanks for this proposal!

When I was reading the Math.sumPrecise proposal tweet from @robpalme, I somehow read into it that it would perform precise floating point math, to avoid the classic 0.1 + 0.2 = 0.30000000000000004 unusual behavior from JavaScript.

But running the polyfill in Node.js does not lead to other results:

const Math2 = await import('./sum.mjs')
Math2.sum([0.1, 0.2])
> 0.30000000000000004

But this appears to be a separate issue, which @jessealama @apaprocki @jirkamarsik and previously @littledan have worked on in the Decimal proposal:

0.1m + 0.2m = 0.3m

I wonder if the name sumPrecise would confuse other users (as it did me) enough to consider it when looking for solutions for floating point arithmetic problems in JS.

Yeah, we were aware that there would be this confusion. But the name does need to somehow indicate that this is more precise than normal floating point summation, and I think any name with that property would confuse people as it did you. So I don't see a way to avoid this problem. At some point people will have to learn how floating point numbers actually work.

For what it's worth, this does perform precise floating point math. The problem here is that writing 0.1 does not represent the real number 0.1 but rather the real number 0.1000000000000000055511151231257827021181583404541015625. Similarly for 0.2 and 0.3. If you add the real number actually represented by 0.1 to the real number actually represented by 0.2, and then take the closest JavaScript number value to the result, that is gives you 0.30000000000000004.

That doesn't generalize to adding more than 2 numbers normally - x + y + z is not necessarily the same as adding the real number represented by x to the real number represented by y and the real number represented by z - but does when using sumPrecise.

Understood, the naming does pose a dilemma. I wonder if there would be any other naming options which could describe properties of this solution in an alternative way ๐Ÿค”

Anyway, short of finding another suitable name, happy to either leave this open for visibility to others or close it - whatever fits wishes of the project.

It was originally called Math.sumExact. I thought Math.sumPrecise was less likely than sumExact to cause that confusion, though still fairly likely (and I couldn't really articulate why I thought sumPrecise was better, either.)

No one made any other suggestions. I half-suggested Math.fsum since that's what the Python method is named, but I don't really like that either because it's just not informative at all.

Reading through a bit of Shewchuk's paper and researching about related material, 2 additional suggestions came to mind:

  • Math.adaptiveSum()
  • Math.stableSum()

Neither of those really suggest "this is more precise than naive summation".

May I suggest:

  • Math.sumFiner()
  • Math.sumStricter()

Something with -er postfix suggests its better precision but not exact precision.

Rather than treating the problem of user confusion that Math.sumPrecise([0.1, 0.1, 0.1]) doesn't equal 0.3 exactly as something to be solved by renaming the method, I think it would make more sense to name it in a manner that would not need to be changed after proposal-decimal is added into TC39. Once that proposal lands, a method like this will return an exact sum and it won't be reasonable to then change the name.

I like Math.sumExact because AFAICT the proposal here doesn't implement the full adaptive range in Shewchuk's paper and instead just goes to arbitrary precision to return an exact result, however there are some other options:

  • Math.sumHigherPrecision or Math.sumSlowPrecise

@Event-Horizon's suggestion of an -er postfix could be good and would still be a valid name after proposal-decimal lands.

I also support keeping the name Math.sumPrecise because 1.) it is a heavy load to put on a proposal like this to educate users and programmers about binary and decimal representations and 2.) a name with that aim in mind may not do much to accomplish it.

Once that proposal lands, a method like this will return an exact sum and it won't be reasonable to then change the name.

This method will never accept Decimals (for the same reason it doesn't accept BigInts), so I don't think the possible addition of a Decimal type to the language has much bearing here.

Ok. Sorry for my confusion. Nevertheless I think we should not worry too much about picking a method name in order to educate users on floating point in JS generally. Rather we should pick one that describes the method well, which Math.sumPrecise does just fine.

Sorry if this is an ignorant question - Does this method's name need to express the fact that it's doing more precise summation? The current README says:

Summing a list is a very common operation and is one of the few remaining use cases for Array.prototype.reduce. Better to let users express that operation directly.
and also
Also, summing a list of floating point numbers can be done more precisely ... a fact which few JavaScript programmers are aware of

How much slower is this precise approach? At what point would I not want to reach for it? Unless this method is much slower and likely to be a footgun, would it be better to just call it Math.sum and consider a Math.looseSum/Math.impreciseSum or something for the naive implementation? It feels like the second would be less likely to lead to confusion if the 99% use-case is compatible with the "more correct" implementation.

(Asking because I worry about the same confusion with the "precise" wording)

If the risk of performance footgun is high, I'd vote for Math.strictSum. I actually think the ambiguity around what strict means is more likely to get users to look up the method and understand the use-case and tradeoffs

bakkot commented

Optimistically for lists of > 100 elements it will be perhaps 4x slower than naive summation implemented natively would be, but that's just from eyeballing graphs from this post. As a user it will still probably be faster than your own code, at least until your code gets into a higher optimization tier in the engine. Though to be clear, these are at this point just my personal guesses; it's hard to know this sort of thing prior to actual implementations.

Whether that slowdown matters depends on your perspective, I suppose. Engines expressed reservations about using the name "sum" for anything more complicated than naive summation.

strictSum isn't a terrible name, though I'd probably spell it sumStrict for discoverability reasons. I don't know that we'd really be benefitting users by hiding fact that this algorithm is in fact as precise as you can get, though - if they're asking the computer to do math, they're going to have to learn about its limitations at some point.