tc39/proposal-intl-numberformat-v3

Rounding Options Puzzle

sffc opened this issue · 16 comments

sffc commented

I have a puzzle which has perplexed me.

Below, I list real-life use cases for how users want to round their numbers in Intl.NumberFormat. I am trying to figure out some set of options that is capable of expressing these various different rounding strategies.

Compact Notation Rounding

Input       Style 1   Style 2   Style 3
1,234,000   1234K     1234K     1234K
123,400     123K      123K      123K
12,340      12K       12K       12.3K
1,234       1.2K      1.2K      1.23K
1,034       1K        1.0K      1.03K
.1034       .1        .10       .103
.1234       .12       .12       .123

English Descriptions

  • Style 1: When there are 2 or more digits before the decimal separator, round to the nearest integer. Otherwise, round to 2 significant digits. Strip trailing zeros.
  • Style 2: When there are 2 or more digits before the decimal separator, round to the nearest integer. Otherwise, round to 2 significant digits. Retain trailing zeros.
  • Style 3: When there are 3 or more digits before the decimal separator, round to the nearest integer. Otherwise, round to 3 significant digits. Strip trailing zeros.
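The three descriptions above can be sketched as one small function. This is only an illustration under stated assumptions: `compactRound` and its parameters are hypothetical names, not part of Intl.NumberFormat; the input is the number after compact scaling (1,234 becomes 1.234 before the "K" is appended); and output uses a leading "0." where the table writes ".1".

```javascript
// Hypothetical helper, not an Intl.NumberFormat API.
// mantissa: the value after compact scaling, e.g. 1234 -> 1.234 (then "K" is appended).
// sigDigits: 2 for Styles 1 and 2, 3 for Style 3.
// stripTrailingZeros: true for Styles 1 and 3, false for Style 2.
function compactRound(mantissa, sigDigits, stripTrailingZeros) {
  // Digits before the decimal separator (a value below 1 counts as the single digit "0").
  const intDigits = String(Math.trunc(Math.abs(mantissa))).length;
  let text;
  if (intDigits >= sigDigits) {
    // Enough integer digits: round to the nearest integer.
    text = String(Math.round(mantissa));
  } else {
    // Otherwise round to the requested number of significant digits.
    text = mantissa.toPrecision(sigDigits);
  }
  if (stripTrailingZeros && text.includes(".")) {
    text = text.replace(/0+$/, "").replace(/\.$/, "");
  }
  return text;
}

console.log(compactRound(1.034, 2, true));  // "1"    (Style 1)
console.log(compactRound(1.034, 2, false)); // "1.0"  (Style 2)
console.log(compactRound(12.34, 3, true));  // "12.3" (Style 3)
```

The sketch assumes mantissas between roughly 1e-6 and 1e21; outside that range `toPrecision` and `String` switch to exponent notation.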

Thoughts

Style 1 could be expressed as minFrac=0, maxFrac=0, and minSig=2, and when minSig is in conflict with maxFrac, minSig wins, except that we strip trailing zeros. In other words, we could make Style 1 be expressed as:

{
  minimumFractionDigits: 0,
  maximumFractionDigits: 0,
  minimumSignificantDigits: 2
}

However, this approach is not capable of expressing Style 2.

We could have an option like "applyFractionGreaterThanIntDigits", which would mean to use minFrac/maxFrac when there are a certain number of integer digits, and minSig/maxSig when there are fewer. This is not a very pretty option, but it is capable of expressing all three styles:

Option                              Style 1   Style 2   Style 3
minimumFractionDigits               0         0         0
maximumFractionDigits               0         0         0
minimumSignificantDigits            1         2         1
maximumSignificantDigits            2         2         3
applyFractionGreaterThanIntDigits   2         2         3

Currency Rounding

Input   Style 1   Style 2   Style 3   Style 4
1       $1.00     $1        $1.00     $1.00
1.01    $1.01     $1.01     $1.00     $1.00
1.04    $1.04     $1.04     $1.05     $1.00
1.12    $1.12     $1.12     $1.10     $1.10

English Descriptions

  • Style 1: Round with 2 fixed fraction digits.
  • Style 2: Round with 2 fixed fraction digits, but strip trailing zeros if the fraction is zero.
  • Style 3: Nickel rounding: round to the nearest 0.05; display 2 fraction digits.
  • Style 4: Dime rounding: round to the nearest 0.1; display 2 fraction digits.

Thoughts

A simple boolean option "stripFractionWhenEmpty" would solve Style 2.

A simple boolean option "nickelRounding" would solve Style 3.

About Trailing Zeros

Note that minFrac already serves absolutely no purpose other than retaining trailing zeros.

Given that minFrac is really only about retaining trailing zeros, for Style 4, we could let minFrac be greater than maxFrac, but it is weird for a minimum to be greater than a maximum.

Since a lot of the problems in this section, as well as Style 2 in the previous section, involve various different ways of treating trailing zeros, maybe we could introduce a "trailingZeroStyle" option, an enum with several different options that encompass all of the use cases.

Distance Rounding

Input   Style 1     Style 2
60      50 yards    50 yards
220     200 yards   200 yards
450     450 yards   450 yards
490     500 yards   500 yards
530     550 yards   500 yards
590     600 yards   600 yards

English Descriptions

  • Style 1: Round to the nearest 50.
  • Style 2: Round to the nearest 50 when below 500, and the nearest 100 when above 500.
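A minimal sketch of the two distance styles, to make the cutoff concrete. The function names are hypothetical, and treating an input of exactly 500 as belonging to the "nearest 100" branch is an assumption, since the description only says "below 500" and "above 500".

```javascript
// Hypothetical helpers for illustration only.
function roundToNearest(value, increment) {
  // Math.round rounds halves upward, e.g. 475 -> 500 for an increment of 50.
  return Math.round(value / increment) * increment;
}

// Style 1: always the nearest 50.
function roundDistanceStyle1(yards) {
  return roundToNearest(yards, 50);
}

// Style 2: nearest 50 below the cutoff, nearest 100 from the cutoff upward.
// Assumption: exactly 500 uses the "nearest 100" branch.
function roundDistanceStyle2(yards) {
  return yards < 500 ? roundToNearest(yards, 50) : roundToNearest(yards, 100);
}

console.log(roundDistanceStyle1(530)); // 550
console.log(roundDistanceStyle2(530)); // 500
```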

Thoughts

Style 1 can be represented by a variant of nickelRounding. We could name the option nickelRoundingMagnitude, and if set, it would override fraction and significant rounding. Alternatively, we could allow minFrac/maxFrac to be less than zero, in which case they express the power of 10 at which you round.

Style 2 involves a cutoff. If we can't figure out how to support it, we could declare it out of scope.

Maybe we should throw the minFrac/maxFrac stuff out the door (keep it for backwards compatibility), and devise a whole new way of thinking about rounding strategies.

@echeran @ryzokuken

For the compact notation rounding, I was thinking that in the style of the idea of nickelRoundingMagnitude, we could explicitly control the style of compact rounding by keying the options by the magnitude of the compact number.

If this is what we have now:

{
  minimumFractionDigits: 0,
  maximumFractionDigits: 0,
  minimumSignificantDigits: 2
}

then what I'm thinking might be a new sibling key in that options map:

{
  roundingByMagnitude: 
    {
      2: {minimumFractionDigits: 0,  maximumFractionDigits: 0},
      1: {minimumFractionDigits: 0,  maximumFractionDigits: 0},
      0: {minimumSignificantDigits: 2},
      "-1": {minimumSignificantDigits: 2},
    }
}

In that example, we could say that any value with a magnitude less than the lowest key value (-1) takes the options for that lowest key value (magnitude = -1). A similar thing for compact numbers with magnitudes 3, 4, and higher taking the options in the value of the largest key (mag = 2).
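The lookup rule described above (clamp to the lowest or highest key) could look roughly like the sketch below. `magnitudeOf` and `optionsForMagnitude` are hypothetical names, and the sketch assumes every magnitude between the lowest and highest key has an entry. The negative key is quoted because an unquoted -1 is not a valid property name in a JavaScript object literal.

```javascript
// Hypothetical helpers; roundingByMagnitude is only a proposal in this thread.

// Power of ten of the leading digit, e.g. 123.4 -> 2, 1.234 -> 0, 0.1234 -> -1.
function magnitudeOf(x) {
  return Number(Math.abs(x).toExponential().split("e")[1]);
}

// Pick the options for a magnitude, clamping to the lowest and highest keys.
// Assumes the table has no gaps between its lowest and highest key.
function optionsForMagnitude(roundingByMagnitude, magnitude) {
  const keys = Object.keys(roundingByMagnitude).map(Number).sort((a, b) => a - b);
  const lowest = keys[0];
  const highest = keys[keys.length - 1];
  const clamped = Math.min(Math.max(magnitude, lowest), highest);
  return roundingByMagnitude[clamped];
}

const roundingByMagnitude = {
  2: { minimumFractionDigits: 0, maximumFractionDigits: 0 },
  1: { minimumFractionDigits: 0, maximumFractionDigits: 0 },
  0: { minimumSignificantDigits: 2 },
  "-1": { minimumSignificantDigits: 2 },
};

// The compact mantissa 1234 (magnitude 3) falls back to the entry for magnitude 2.
console.log(optionsForMagnitude(roundingByMagnitude, magnitudeOf(1234)));
```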

This functionality is still independent (orthogonal) to the need for stripFractionWhenEmpty, which happens after the conversion to compact (the "misc" in number -> compact -> rounding -> misc).

sffc commented

I think this is the minimal set of options that covers all use cases except the distance cutoff case:

  1. trailingZeros:
    • "auto" = obey minimumFractionDigits or minimumSignificantDigits (default behavior).
    • "strip" = always remove trailing zeros.
    • "stripIfInteger" = remove them only when the entire fraction is zero.
    • optional: "keep" = always keep trailing zeros according to the rounding magnitude.
  2. nickelRounding:
    • true = the last digit before rounding should be a 0 or a 5.
    • false = the last digit can be any digit from 0 through 9 (default behavior).
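To illustrate the trailingZeros values listed above, here is a rough sketch that post-processes an already rounded digit string. `applyTrailingZeros` is a hypothetical name, the sketch assumes "." as the decimal separator, and it leaves out the optional "keep" value because that one needs the rounding magnitude as extra input.

```javascript
// Hypothetical post-processing step; assumes a plain digit string with "." as separator.
function applyTrailingZeros(formatted, mode) {
  switch (mode) {
    case "strip":
      // Always remove trailing fraction zeros, and the separator if nothing is left.
      return formatted.includes(".")
        ? formatted.replace(/0+$/, "").replace(/\.$/, "")
        : formatted;
    case "stripIfInteger":
      // Remove the fraction only when every fraction digit is zero.
      return formatted.replace(/\.0+$/, "");
    default:
      // "auto": keep whatever the minimum-digit settings produced.
      return formatted;
  }
}

console.log(applyTrailingZeros("1.00", "stripIfInteger")); // "1"
console.log(applyTrailingZeros("1.10", "stripIfInteger")); // "1.10"
console.log(applyTrailingZeros("1.10", "strip"));          // "1.1"
```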

In addition, we would make the following changes to the semantics of the existing options. Note: I am using minSig, maxSig, minFrac, and maxFrac instead of the real, longer names, only to make this shorter and easier to read.

  1. minSig, maxSig, minFrac, and maxFrac are allowed to coexist.
    1. If a minimum comes into conflict with a maximum, the minimum wins.
    2. Trailing zeros are retained according to the new trailingZeros option.
  2. minSig can be greater than maxSig, and minFrac can be greater than maxFrac.
    1. This enables the "dime rounding" use case.
  3. minFrac and maxFrac can be less than zero.
    1. This enables the first "distance rounding" use case.

I know options 2 and 3 are not very pretty, but I think they get the job done. It's a starting point.

Examples

// Compact Style 1 (default for notation: "compact")
{
    minFrac: 0,
    maxFrac: 0,
    minSig: 2,
    trailingZeros: "strip"
}

// Compact Style 2 (retain trailing zeros)
{
    minFrac: 0,
    maxFrac: 0,
    minSig: 2
}

// Compact Style 3 (more significant digits)
{
    minFrac: 0,
    maxFrac: 0,
    minSig: 3,
    trailingZeros: "strip"
}

// Currency Style 1 (default currency style)
{
    minFrac: 2,
    maxFrac: 2
}

// Currency Style 1 Alternative with trailingZeros: "keep"
{
    maxFrac: 2,
    trailingZeros: "keep"
}

// Currency Style 2 (strip trailing zeros if they are all zero)
{
    minFrac: 2,
    maxFrac: 2,
    trailingZeros: "stripIfInteger"
}

// Currency Style 3 (nickel rounding)
{
    minFrac: 2,
    maxFrac: 2,
    nickelRounding: true
}

// Currency Style 4 (dime rounding)
{
    minFrac: 2,
    maxFrac: 1
}

// Distance Style 1 (nearest 50)
{
    maxFrac: -1,
    nickelRounding: true
}

Thoughts?

sffc commented

Algorithm for resolving minFrac, maxFrac, minSig, and maxSig. I still need to think through the edge cases like 9.9999:

  1. Let x be the input number.
  2. Let mag be the base 10 logarithm of x, rounded down.
  3. Let minSigScale be mag - minSig + 1
  4. Let maxSigScale be mag - maxSig + 1
  5. Let minScale be the maximum of (-1 * minFrac) and minSigScale
  6. Let maxScale be the minimum of (-1 * maxFrac) and maxSigScale
  7. If minScale < maxScale, let maxScale be minScale

Problem: this algorithm doesn't work for the dime rounding case. :(
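For experimenting with edge cases, the seven steps above can be transcribed almost literally. `resolveScales` and `magnitudeOf` are hypothetical names. Reading maxScale as the power of ten at which rounding happens and minScale as the power of ten down to which trailing zeros are kept is my interpretation of the steps, not something the steps state.

```javascript
// Hypothetical transcription of the seven steps; all four settings must be provided.

// Step 2: base 10 logarithm of x, rounded down (read from exponent notation
// to avoid floating-point error in Math.log10).
function magnitudeOf(x) {
  return Number(Math.abs(x).toExponential().split("e")[1]);
}

function resolveScales(x, { minFrac, maxFrac, minSig, maxSig }) {
  const mag = magnitudeOf(x);                             // step 2
  const minSigScale = mag - minSig + 1;                   // step 3
  const maxSigScale = mag - maxSig + 1;                   // step 4
  const minScale = Math.max(0 - minFrac, minSigScale);    // step 5
  let maxScale = Math.min(0 - maxFrac, maxSigScale);      // step 6
  if (minScale < maxScale) maxScale = minScale;           // step 7
  return { minScale, maxScale };
}

// Compact mantissa 1.234 with minFrac = maxFrac = 0 and minSig = maxSig = 2:
console.log(resolveScales(1.234, { minFrac: 0, maxFrac: 0, minSig: 2, maxSig: 2 }));
// { minScale: 0, maxScale: -1 }
```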

sffc commented

If we could reinvent the existing rounding settings from scratch, I might propose that they be:

  • fractionDigits = combination of [minimum/maximum]FractionDigits
  • significantDigits = combination of [minimum/maximum]SignificantDigits
  • trailingZeros = enum explained above ("keep", "strip", "stripIfInteger")
    • Default: "keep"

If both fractionDigits and significantDigits are present, we could say that they have the semantics described for Compact Notation Rounding, where fractionDigits is a "maximum" and significantDigits is a "minimum". The algorithm could be cleanly stated as:

  1. Let result be ToRawPrecision(x, significantDigits)
  2. If result.[[FractionDigitsCount]] <= fractionDigits:
    1. Let result be ToRawFixed(x, fractionDigits)
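A rough way to try this two-step algorithm is to stand in `Number.prototype.toPrecision` and `toFixed` for the spec operations ToRawPrecision and ToRawFixed. That substitution is an assumption and only approximates the spec for ordinary values (roughly 1e-6 up to 1e21); the function names are hypothetical.

```javascript
// Stand-in for ToRawPrecision with a single significantDigits value (assumption).
function toRawPrecision(x, significantDigits) {
  const text = x.toPrecision(significantDigits);
  // toPrecision uses exponent notation for large values ("1.2e+2"); expand it to "120".
  return text.includes("e") ? String(Number(text)) : text;
}

// Number of digits after the decimal separator in a digit string.
function fractionDigitsCount(text) {
  const dot = text.indexOf(".");
  return dot === -1 ? 0 : text.length - dot - 1;
}

function resolveDigits(x, fractionDigits, significantDigits) {
  // Step 1: round to significant digits.
  let result = toRawPrecision(x, significantDigits);
  // Step 2: if that leaves no more fraction digits than allowed, use fixed rounding instead.
  if (fractionDigitsCount(result) <= fractionDigits) {
    result = x.toFixed(fractionDigits); // stand-in for ToRawFixed
  }
  return result;
}

// Compact Style 2 mantissas with fractionDigits: 0, significantDigits: 2
console.log(resolveDigits(123.4, 0, 2)); // "123"
console.log(resolveDigits(1.034, 0, 2)); // "1.0"
```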

We still need to cover the nickel and dime rounding cases. We could shim it in as a more restricted version of ICU's rounding increment setting:

  • roundingIncrement = an integer, either 1 or 5 with any number of zeros.
    • Example values: 1 (default), 5, 10, 50, 100
    • Nickel rounding: { fractionDigits: 2, roundingIncrement: 5 }
    • Dime rounding: { fractionDigits: 2, roundingIncrement: 10 }
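A minimal sketch of how roundingIncrement could combine with fractionDigits, assuming the increment is counted in units of the last displayed digit (cents when fractionDigits is 2). `roundWithIncrement` is a hypothetical name, and the sketch uses plain floating-point arithmetic with half-up rounding, so inputs that sit exactly on a midpoint may not round as a decimal implementation would.

```javascript
// Hypothetical helper: roundingIncrement is expressed in units of the last
// displayed fraction digit (assumption), e.g. 5 with fractionDigits 2 means 0.05.
function roundWithIncrement(x, fractionDigits, roundingIncrement) {
  const scale = 10 ** fractionDigits;
  // Round once, directly to a multiple of the increment, to avoid double rounding.
  const units = Math.round((x * scale) / roundingIncrement) * roundingIncrement;
  return (units / scale).toFixed(fractionDigits);
}

console.log(roundWithIncrement(1.04, 2, 5));  // "1.05" (nickel rounding)
console.log(roundWithIncrement(1.04, 2, 10)); // "1.00" (dime rounding)
console.log(roundWithIncrement(1.12, 2, 10)); // "1.10"
```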

Here's how to express all the rounding styles using these four options:

// Compact Style 1 (default for notation: "compact")
{
    fractionDigits: 0,
    significantDigits: 2,
    trailingZeros: "strip",
}

// Compact Style 2 (retain trailing zeros)
{
    fractionDigits: 0,
    significantDigits: 2,
}

// Compact Style 3 (more significant digits)
{
    fractionDigits: 0,
    significantDigits: 3,
    trailingZeros: "strip",
}

// Currency Style 1 (default currency style)
{
    fractionDigits: 2,
}

// Currency Style 2 (strip trailing zeros if they are all zero)
{
    fractionDigits: 2,
    trailingZeros: "stripIfInteger"
}

// Currency Style 3 (nickel rounding)
{
    fractionDigits: 2,
    roundingIncrement: 5,
}

// Currency Style 4 (dime rounding)
{
    fractionDigits: 2,
    roundingIncrement: 10,
}

// Distance Style 1 (nearest 50)
{
    fractionDigits: 0,
    roundingIncrement: 50
}

Great! Okay, so then working backwards, how do we map this onto the existing options?

We could say that if both minFrac/maxFrac and minSig/maxSig are present, then we use a slightly modified version of the above algorithm for resolving them:

  1. Let result be ToRawPrecision(x, minSig, maxSig)
  2. If result.[[FractionDigitsCount]] <= minFrac: // TODO: minFrac or maxFrac here?
    1. Let result be ToRawFixed(x, minFrac, maxFrac)

As it is currently, if a trailing zero is to be retained, it must be "protected" by minFrac or maxFrac. In other words, the default trailingZeros: "keep" setting only keeps the trailing zeros if they are protected. A better name for that option might be trailingZeros: "keepMin"

The styles can then be expressed as:

// Compact Style 1 (default for notation: "compact")
{
    minFrac: 0,
    maxFrac: 0,
    minSig: 2,
    maxSig: 2,
    trailingZeros: "strip",
}

// Compact Style 2 (retain trailing zeros)
{
    minFrac: 0,
    maxFrac: 0,
    minSig: 2,
    maxSig: 2,
}

// Compact Style 3 (more significant digits)
{
    minFrac: 0,
    maxFrac: 0,
    minSig: 3,
    maxSig: 3,
    trailingZeros: "strip",
}

// Currency Style 1 (default currency style)
{
    minFrac: 2,
    maxFrac: 2,
}

// Currency Style 2 (strip trailing zeros if they are all zero)
{
    minFrac: 2,
    maxFrac: 2,
    trailingZeros: "stripIfInteger"
}

// Currency Style 3 (nickel rounding)
{
    minFrac: 2,
    maxFrac: 2,
    roundingIncrement: 5,
}

// Currency Style 4 (dime rounding)
{
    minFrac: 2,
    maxFrac: 2,
    roundingIncrement: 10,
}

// Distance Style 1 (nearest 50)
{
    minFrac: 0,
    maxFrac: 0,
    roundingIncrement: 50
}

Thoughts? @ryzokuken @echeran

In the previous comment, I see the progression of thought from fractionDigits and significantDigits and how it maps onto the existing options to achieve the originally categorized styles. But I don't know if that solves some of the outlier cases (ex: 123,400 and 1,234,000).

We have:

// Compact Style 1 (default for notation: "compact")
{
    minFrac: 0,
    maxFrac: 0,
    minSig: 2,
    maxSig: 2,
    trailingZeros: "strip",
}

But for Style 1, the input 123,400 after compact rounding becomes 123K, which has 3 significant digits, more than maxSig: 2. I think that is what we noticed in our previous discussion (@sffc @ryzokuken): the parameters to achieve Style 1 are not consistent across all cases in the original table, where each case corresponds to a different magnitude. And the current compact number format code (Style 1) effectively does magnitude-based logic for the compact number, just described differently (checking whether the number of digits in the integer part of the compact rounded floating point is < 2).

To the comment before that, that lays out 3 changes, in which changes ("options"?) 2 & 3 are admittedly ugly, I still prefer to avoid that if at all possible. I understand the motivation -- simplify the logic / algorithm -- but the tradeoff / cost comes in the mental overhead of mixed semantics. We allow minFrac to be greater than maxFrac only if we explain that we additionally reinterpret their meanings as guaranteed number of fractional digits including trailing zeroes, and max fractional significant digits. "Mental overhead" is usually a sign of complexity -- in this case, it's the intertwining of 2 different semantics in one term. Allowing this kind of complexity to leak to the user seems undesirable to me.

I don't know how to get around the magnitude-dependent way in which the desired formatting behavior seems to be defined (ex: 123,400 -> 123K; 1,234 -> 1.2K). I'd prefer to have that fact just be represented clearly in the user-provided data, which makes it explicit and flexible for the user.

sffc commented

But for Style 1, the input 123400 after compact rounding becomes 123K, which has 3 significant digits, more than maxSig: 2.

Right. The behavior is unintuitive in the compact notation case, but it is consistent with the algorithm, which is clean and simple.

To the comment before that, that lays out 3 changes, in which changes ("options"?) 2 & 3 are admittedly ugly, I still prefer to avoid that if at all possible. I understand the motivation -- simplify the logic / algorithm -- but the tradeoff / cost comes in the mental overhead of mixed semantics.

Noted. We may be able to get around the mixed semantics if we use the roundingIncrement option for dime rounding instead of the minFrac > maxFrac solution.

Maybe we could propose the four options (trailingZeros, roundingIncrement, and the new fractionDigits and significantDigits options), and consider [min/max][Frac/Sig] as a historical artifact, which can still be used, but which have odd behavior in some cases.

Or maybe we can take the less-clean version of the algorithm for mixing minSig with maxFrac, combined with trailingZeros and roundingIncrement.

sffc commented

I don't know how to get around the magnitude-dependent way in which the desired formatting behavior seems to be defined (ex: 123,400 -> 123K; 1,234 -> 1.2K). I'd prefer to have that fact just be represented clearly in the user-provided data, which makes it explicit and flexible for the user.

Also noted. The option you suggested makes the magnitude cutoff explicit. We'd still need the roundingIncrement and trailingZeros options in order to cover currency and distance.

sffc commented

Note to self: I intend to make a little web app to demonstrate the three main approaches listed in this thread. This will help us better understand the limitations of each option.

sffc commented

Here's my latest attempt. I think it's the cleanest one so far. Keep maximum* talking about rounding magnitude, and minimum* talking about display magnitude (trailing zeros). Then, introduce a mode for both rounding and display magnitude to choose the smaller or the greater of the two conflicting options, called "loose" and "strict" below.

I also made an HTML page for playing around a bit, available here.

Examples

maxFrac = 0, maxSig = 2

maxMode=loose   maxMode=strict
1234            1200
123             120
12              12
1.2             1
.1              0

minFrac = 1, minSig = 2

minMode=loose   minMode=strict
80.0            80
8.0             8.0
.80             .8

maxFrac = 0, maxSig = 2, maxMode = loose

minSig=1   minSig=2
1034       1034
103        103
10         10
1          1.0
.1         .1

sffc commented

Just to make this more concrete: here is my updated set of options:

  • roundingPreference
    • "significantDigits" (default): ignore fraction digit settings if present.
    • "higherMagnitude": when maximumFractionDigits and maximumSignificantDigits conflict, favor the one that results in rounding at a higher magnitude (fewer significant digits in the resulting output)
    • "lowerMagnitude": when maximumFractionDigits and maximumSignificantDigits conflict, favor the one that results in rounding at a lower magnitude (more significant digits in the resulting output)
  • roundingIncrement: behavior proposed at #8 (comment): either 1 or 5 followed by any number of zeros.
  • trailingZeroDisplay
    • "auto": current behavior. Keep trailing zeros according to minimumFractionDigits and minimumSignificantDigits.
    • "stripIfInteger": same as "auto", but remove the fraction digits if they are all zero.

These options cover maxMode but not minMode from the previous post. I do not think we need to cover minMode at this time because it's not clear to me that there is a compelling use case for that option.

sffc commented

Bikeshed for the roundingPreference option:

  • roundingPriority: "significantDigits", "morePrecision", "lessPrecision" <--
  • roundingPriority: "significantDigits", "relaxed", "strict"
  • significantDigitsPriority: "auto", "relaxed", "strict"
  • significantDigitsPriority: "auto", "smallNumbers", "largeNumbers"

Bikeshed for the trailingZeroDisplay option:

  • trailingZeroDisplay: "auto", "exceptInteger" <--
  • trailingZeroDisplay: "auto", "exceptWhole"
  • trailingZeroDisplay: "auto", "exceptIfWhole"

sffc commented

Here's a visualization:

image

sffc commented

Please fill out this Doodle if you would like to attend a deep-dive to work this out:

https://doodle.com/poll/zbf68rcw6k9ztre9?utm_source=poll&utm_medium=link

sffc commented

2021-04-06: @gibson042 is now aligned with my mental model given the following explanation: maximumFractionDigits must be interpreted to mean, "round the number at a specific power of 10". So, for example, "maximumFractionDigits: 2" means "round the number at 10^-2" or equivalently "round the number at the hundredths place".

So, for instance, if we had the settings

{
    maximumFractionDigits: 2,
    maximumSignificantDigits: 2
}

Those settings mean:

  1. Round the number at the hundredths place
  2. Round the number after the second significant digit

Now, consider the number "4.321". maximumFractionDigits wants to round at the hundredths place, producing "4.32". However, maximumSignificantDigits wants to round after two significant digits, producing "4.3". We therefore have a conflict.

The new setting roundingPriority offers a hint on how to resolve this conflict. There are three options:

  1. roundingPriority: "significantDigits" means that significant digits always win a conflict.
  2. roundingPriority: "morePrecision" means that the result with more precision wins a conflict.
  3. roundingPriority: "lessPrecision" means that the result with less precision wins a conflict.
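The conflict and its three resolutions can be sketched by computing the power of ten each setting wants to round at and then picking one. The helper names are hypothetical, the rounding uses plain floating-point arithmetic, and the sketch covers only the maximum-digits side.

```javascript
// Hypothetical sketch of resolving a maximumFractionDigits / maximumSignificantDigits conflict.

// Power of ten of the leading digit, e.g. 4.321 -> 0, 1234 -> 3, 0.1 -> -1.
function magnitudeOf(x) {
  return Number(Math.abs(x).toExponential().split("e")[1]);
}

// Round x at 10^scale (scale -2 is the hundredths place, scale 2 the hundreds place).
function roundAtScale(x, scale) {
  if (scale <= 0) return Number(x.toFixed(-scale));
  const unit = 10 ** scale;
  return Math.round(x / unit) * unit;
}

function resolveMaximum(x, maxFrac, maxSig, roundingPriority) {
  const fracScale = 0 - maxFrac;                  // where maximumFractionDigits rounds
  const sigScale = magnitudeOf(x) - maxSig + 1;   // where maximumSignificantDigits rounds
  let scale;
  if (roundingPriority === "morePrecision") {
    scale = Math.min(fracScale, sigScale);        // lower power of ten keeps more digits
  } else if (roundingPriority === "lessPrecision") {
    scale = Math.max(fracScale, sigScale);        // higher power of ten keeps fewer digits
  } else {
    scale = sigScale;                             // "significantDigits" always wins
  }
  return roundAtScale(x, scale);
}

console.log(resolveMaximum(4.321, 2, 2, "morePrecision")); // 4.32
console.log(resolveMaximum(4.321, 2, 2, "lessPrecision")); // 4.3
```

With maxFrac = 0 and maxSig = 2, the same function gives 1234 under "morePrecision" and 1200 under "lessPrecision", matching the maxMode=loose and maxMode=strict columns earlier in the thread.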

This resolution algorithm applies separately between the maximum digits settings and the minimum digits settings. So, for example, suppose you had

{
    minimumFractionDigits: 2,
    minimumSignificantDigits: 2
}

Consider the input number "1". minimumFractionDigits wants to retain trailing zeros up to the hundredths place, producing "1.00", whereas minimumSignificantDigits wants to retain only as many as are required to render two significant digits, producing "1.0". We again have a conflict, and the conflict is resolved in the same way.
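The minimum-digits side can be sketched the same way: work out how many fraction digits each setting wants to display, then pick one. `resolveMinimum` is a hypothetical name, and the sketch only pads with zeros, so it assumes the input already has no more digits than will be shown.

```javascript
// Hypothetical sketch of resolving a minimumFractionDigits / minimumSignificantDigits conflict.

// Power of ten of the leading digit, e.g. 80 -> 1, 1 -> 0, 0.8 -> -1.
function magnitudeOf(x) {
  return Number(Math.abs(x).toExponential().split("e")[1]);
}

function resolveMinimum(x, minFrac, minSig, roundingPriority) {
  // Fraction digits needed so that minSig significant digits are displayed.
  const digitsFromSig = Math.max(0, minSig - 1 - magnitudeOf(x));
  let digits;
  if (roundingPriority === "morePrecision") {
    digits = Math.max(minFrac, digitsFromSig);
  } else if (roundingPriority === "lessPrecision") {
    digits = Math.min(minFrac, digitsFromSig);
  } else {
    digits = digitsFromSig; // "significantDigits" always wins
  }
  // toFixed pads with trailing zeros (it would also round an input with more digits).
  return x.toFixed(digits);
}

console.log(resolveMinimum(1, 2, 2, "morePrecision"));     // "1.00"
console.log(resolveMinimum(1, 2, 2, "lessPrecision"));     // "1.0"
console.log(resolveMinimum(1, 2, 2, "significantDigits")); // "1.0"
```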

I will work on additional examples and explanations. But I think we made a whole lot of progress in explaining this model in a way that makes sense.