tc39/proposal-bigint

Axel Rauschmayer's alternative deserves serious consideration

allenwb opened this issue · 20 comments

Axel Rauschmayer has an alternative proposal for integrating arbitrary-precision integers into ES. It provides similar functionality to the present Integers proposal, makes different trade-offs, and in some ways is a simpler and less intrusive extension to the current language. It seems like a plausible alternative to the current proposal.

Some details of Axel's proposal have been discussed under issue #30. But I don't see a comprehensive evaluation/analysis of Axel's proposal as an alternative to the current proposal. I believe that such an analysis/comparison is something that the Integer champions should undertake for presentation to TC39.


A Further Simplification

Axel's basic premise is that "Number is a supertype of Int and Double: Number = Int ∪ Double". Instead of introducing new Number subtypes into the language, a further simplification would be to eliminate the supertype/subtype concept and simply expand the value domain of the ECMAScript Number type.

Instead of defining number as "The Number type has exactly 18437736874454810627 (that is, 2^64 - 2^53 + 3) values, representing the double-precision 64-bit format IEEE 754-2008 values..." we could define it something like:

  • The Number type consists of all integer values and the 18437736874454810627 IEEE-754-2008 double-precision 64-bit values. The integer values are called "precise numbers" and the IEEE-754 values are called "imprecise numbers". Note that a precise number such as 1024 and its IEEE equivalent 1024.0 are distinct values within the single Number type.

This simplified definition can then be mapped onto Axel's proposal, with "precise number" replacing the Int subtype and "imprecise number" replacing the Double subtype. Other than terminology and perhaps some API simplification, Axel's proposal could be largely unchanged.
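A minimal sketch of the intended classification, using the terminology above (these semantics are hypothetical; today 1024 and 1024.0 parse to the same value):

typeof 1024     // "number" — a precise number (an exact integer value)
typeof 1024.0   // "number" — an imprecise number (an IEEE-754 double)
// Distinct values of the one Number type; no new subtype is introduced.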

I can't imagine we could actually get away with breaking 2**53 + 1 === 2**53, which this proposal does, no? Among other things, it'd break anyone using numbers for bit-packing, which is a nontrivial number of people.
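For reference, the behavior in question under today's IEEE-double semantics:

2 ** 53 + 1 === 2 ** 53   // true today: 2**53 + 1 rounds to 2**53
2 ** 53 + 2 === 2 ** 53   // false: 2**53 + 2 is exactly representable

Under the expanded-Number proposal, the first comparison would become false.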

@bakkot That's the big question that needs to be explored. It seems implausible that very much code actually depends upon integer results in excess of 2**53 being inaccurate, or would break if they became accurate. Arguably, most JS programmers aren't even aware of that boundary.

I'm not sure what you are concerned about WRT "bit-packing". Do you have an example that would break? Implementations would still be free to internally optimize their representation of various sized integers, just like they can do now.

@allenwb I don't have a code sample on hand, but the kind of bit-packing I'm thinking of is using JS numbers to transmit data: it's entirely possible to use all 2^64-2^53 finite numbers to encode and decode data, and code which does so is likely to assume that numbers larger than 2^53 have the semantics of IEEE doubles, not integers.
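A sketch of the kind of bit-packing pattern this describes, using DataView (the payload values here are arbitrary; a careful encoder avoids bit patterns that decode as NaN, since engines may not preserve NaN payloads):

// Pack 64 arbitrary bits into a Number by reinterpreting bytes as a double:
const view = new DataView(new ArrayBuffer(8))
view.setUint32(0, 0xdeadbeef)
view.setUint32(4, 0x01234567)
const packed = view.getFloat64(0)   // a Number carrying the 64-bit payload

// Decoding assumes packed still behaves as an IEEE double:
view.setFloat64(0, packed)
view.getUint32(0).toString(16)      // "deadbeef"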

Presumably, any such transmission would have to be via ArrayBuffer/Float64Array. Storing a precise number into a Float64Array element would coerce to an imprecise number exactly as it does today. Retrieving a value from a Float64Array would produce an imprecise number, also just like today.
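A hypothetical sketch of that store/retrieve behavior under the expanded-Number reading (today the first comparison already holds, but because the addition itself rounds):

const buf = new Float64Array(1)
buf[0] = 2 ** 53 + 1    // hypothetically the precise integer 9007199254740993
buf[0] === 2 ** 53      // true: rounded to the nearest double on store, as today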

Storing a precise number into a Float64Array element would coerce to an imprecise number exactly as they do today

Er... I believe Float64Arrays precisely represent all finite JS numbers. No?

Values in the range Number.MIN_SAFE_INTEGER ... Number.MAX_SAFE_INTEGER can be precisely represented using IEEE Doubles. Whether or not any particular computational result in that range is precise depends upon the specific computation and the values that seeded it.
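For reference, those boundaries via the standard Number APIs:

Number.MIN_SAFE_INTEGER === -(2 ** 53 - 1)   // true
Number.MAX_SAFE_INTEGER === 2 ** 53 - 1      // true
Number.isSafeInteger(2 ** 53 - 1)            // true
Number.isSafeInteger(2 ** 53)                // false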

By,

Storing a precise number into a Float64Array element would coerce to an imprecise number

I meant that any precise integer value would be converted to an IEEE Double. Whether or not that double is a precise representation of the original integer depends upon whether or not the integer was in the MIN_SAFE_INTEGER to MAX_SAFE_INTEGER range. Regardless, I'm suggesting that all IEEE Double values, including those in the MIN_SAFE_INTEGER to MAX_SAFE_INTEGER range, are part of the "imprecise numbers" subset of the Number type.

Another way to think about it is that storing to/from Float64Array works exactly like it does today. It consumes/produces Number values that are semantically treated as IEEE doubles.

I believe current code can treat as an invariant that any JS value with type number can round-trip through writing then reading from a Float64Array (possibly ending up with a NaN with a different bit pattern than the original, but still a NaN).
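A runnable statement of that invariant (a sketch):

// Every current Number survives a Float64Array round trip
// (NaN may come back with a different bit pattern, but is still NaN).
function roundTrips(x) {
  const buf = new Float64Array(1)
  buf[0] = x
  return buf[0] === x || (Number.isNaN(x) && Number.isNaN(buf[0]))
}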

This proposal would break that, right? So I'm still confused about what "would coerce to an imprecise number exactly as they do today" means. AFAICT there is no coercion currently.

Is there real code that relies on that invariant that this would break? I'd be curious to see an example.

OK, it looks like a value going into a Float32Array that doesn't come out === is thus 64-bit, and you'd put it in a Float64Array instead. How would Axel's proposal change how integers (32-bit or 64-bit) are retrieved from a Float32Array, where they'd already been truncated to 32 bits? (I'm a bit fuzzy on Typed Arrays, so thanks for bearing with me.)
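For concreteness, the Float32Array rounding being referenced, as it behaves today:

const f32 = new Float32Array(1)
f32[0] = 2 ** 25 + 1      // needs 26 significand bits; float32 has 24
f32[0] === 2 ** 25 + 1    // false: rounded on the way in
f32[0] === 2 ** 25        // true (ties-to-even rounding)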

@ljharb, it wouldn't change that, I don't think. But the above example relies on being able to precisely encode all JS numbers by putting them into Float64Array and reading out the resulting bits, and the proposal would break that.

I'm pretty sure that in the presence of Axel's proposal (with my reinterpretation) any existing code would work identically to how it does today. Integer values greater than MAX_SAFE_INTEGER would round-trip through the encoder/decoder as imprecise IEEE singles/doubles, just as they do today.

Some (likely very rare) integer computations involving large integers that are not currently precisely representable might produce slightly different (non-precise) values that round to different Float64 values:

function IntegerSupportFeatureCheck() {
   // return true if precise integers are supported; under today's double
   // semantics both sums round to 9007199254740992 and compare equal
   return 9007199254740991 + 1 !== 9007199254740991 + 2
}

But any such calculations are already operating in a non-precise number domain. Other than that possibility, the encoder module encodes/decodes numbers as Doubles just like today.

Such an encoder could be enhanced to explicitly encode/decode precise integers, but that would represent a new coding pattern and, for interop reasons, probably needs explicit opt-in.

Thanks for filing this bug. A few people who have been commenting on this repository have asked for more complete interoperation between Integers and Numbers, and Axel's proposal seems to be the cleanest logical conclusion of it that I've seen so far.

That said, I have a few concerns:

  • To get integer semantics for several (but not all) operators, you have to use new contextual keywords as infix operators. Users who forget to use the new ones will likely silently lose precision (throwing risks web compatibility). The Integer proposal here attempts to make the simplest, most ergonomic path (using operators) come to the right answer by giving existing operators Integer semantics.
  • In addition to changing semantics of existing operators on existing Numbers, it creates a distinction between 1.0 and 1, which didn't exist before. Existing pieces of code are likely to use these interchangeably (unlike in languages where the distinction always existed), which can lead to confusion.
  • This proposal creates multiple primitive values which are === but observably different: 1 === 1.0, yet Number.isDouble(1) !== Number.isDouble(1.0) (sketched after this list). They need to be === or existing code will likely break; OTOH, you also need a Double sense of 1.0, or otherwise operators can't have a static sense of what their input and output types are.
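A minimal sketch of that last concern, assuming the Number.isDouble predicate from Axel's proposal:

// Hypothetical semantics under Axel's proposal:
1 === 1.0              // true — must hold, or existing code breaks
Number.isDouble(1.0)   // true:  a Double
Number.isDouble(1)     // false: an Int
// Two primitives that are === yet observably different.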

While I doubt any code would break from making integers precise, things could break due to depending on Infinity being reached in finite time. I know I've used Infinity as a base case for some mathematical thing at least once (personally I wouldn't care if my own code breaks given that I can't even remember what it was, but I suspect there'll be a fair number of crude mathematical things using Infinity as a base case that people might care about).

Here's a basic example of code that would now become an out-of-memory error instead of its current behaviour of giving a list of 1024 numbers:

function genNums(i=1) {
    if (i === Infinity) {
        return []
    } else {
        return [i, ...genNums(i*2)]
    }
}

genNums()
qzb commented

@littledan:

This proposal creates multiple primitive values which are === but observably different. 1 === 1.0, but Number.isDouble(1) !== Number.isDouble(1.0).

It's nothing new, 0 and -0 are also strictly equal and observably different at the same time:

function toBytes (number) {
  // view the 8 bytes of the IEEE-754 double, in platform byte order
  // (the indices below assume little-endian, as on virtually all JS engines)
  return new Uint8Array(Float64Array.of(number).buffer)
}

0 === -0                         // true
toBytes(0)[7] !== toBytes(-0)[7] // also true: byte 7 carries the sign bit, and 0 !== 128

Of course Axel's proposal moves this to a whole new level, but still, it isn't anything new or counterintuitive.

@qzb You're making a false comparison. The case of -0 and +0 is not the same as that of 1 and 1.0. The former has a mathematical distinction; the latter does not. The former values do not convey distinct types; the latter would.

qzb commented

@kgryte I wasn't trying to prove that -0 and +0 are the same. I just wanted to show that the problem littledan described isn't completely new: observably different values can already be indistinguishable using the === operator.

But, to be honest, I don't like this distinction between 1 and 1.0 either. Not because of the issues with the === operator; it just feels too gimmicky and too different from other languages' approaches.

Maybe this is splitting hairs a little bit, but +0 vs -0 has the distinction of being the only two Numbers which are === but still observably different. That is, if you have a Number x, and you already know x !== 0, you know that if x === y, then x and y are not observably different Numbers. Adding a distinct 1 vs 1.0 would change that.
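For comparison, the ways -0 is already observably distinct from +0 today:

0 === -0          // true
Object.is(0, -0)  // false
1 / 0             // Infinity
1 / -0            // -Infinity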

I don't know what exactly you can claim from an appeal to math. On one hand, anything that we define with enough formality can be thought of as math. On the other hand, for real numbers, there is no distinction between -0 and +0.

IMO what's more relevant than math is that, historically, JavaScript had a syntax for IEEE 754 double precision arithmetic, which you can get at with syntax like 1 + 2, and changing the semantics of something as basic as that to something else is a big deal.

I have code with similar problems to @Jamesernator's. Sometimes I expect to reach Infinity. I also have other code which only does integer arithmetic (but expects rounding to occur above ~2^53) that would become increasingly inefficient as the integers get bigger and bigger.
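A sketch of that second pattern — a loop that relies on rounding above 2^53 to terminate:

let x = 1
while (x !== x + 1) {   // false today once x reaches 2**53
  x *= 2
}
// x === 2 ** 53 under current semantics; with precise integers the loop
// would never exit, and x would consume ever more memory as it grows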

OK, these compatibility arguments sound very compelling to me. I also raised this prospect of Axel's proposal at the May 2017 TC39 meeting, but the committee was supportive of continuing down the current path of a distinct type.