unitsofmeasurement/uom-systems

Inconsistencies in liquid volume units (US vs. Imperial)

lbeaulac opened this issue · 6 comments

The definitions for non-SI liquid volume units in the CLDR class are inconsistent.

Both US and Imperial systems have the following internal relationships:

  • 1 Gallon = 4 quarts = 8 pints = 16 cups

But the Imperial quantities for those four units are larger than their US counterparts by a constant factor of about 1.2.

There are also parallel definitions for the fluid ounce, tablespoon and teaspoon, but the ratios are different:

  • 1 US fluid ounce = 2 US tablespoons = 6 US teaspoons = 480 US minims ~= 29.573529562 ml
  • 1 Imperial fluid ounce = 1.6 Imperial tablespoons = 4.8 Imperial teaspoons = 480 Imperial minims ~= 28.4130625 ml

As can be seen above, the Imperial fluid ounce is actually smaller than its US counterpart, whereas for the larger volume units the Imperial unit is larger than the US unit. This is explained by differences in the number of fluid ounces that make up the larger units in the two systems, i.e.:

  • 1 US gallon = 128 US fluid ounces
  • 1 US quart = 32 US fluid ounces
  • 1 US pint = 16 US fluid ounces
  • 1 US cup = 8 US fluid ounces
    but
  • 1 Imperial gallon = 160 Imperial fluid ounces
  • 1 Imperial quart = 40 Imperial fluid ounces
  • 1 Imperial pint = 20 Imperial fluid ounces
  • 1 Imperial cup = 10 Imperial fluid ounces
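The arithmetic behind the two lists above can be checked with a small sketch. It assumes only the gallon definitions quoted in this issue (231 cubic inches and 4.54609 litres) plus the exact value of the cubic inch, 16.387064 ml. The class name `FluidOunceComparison` is made up for illustration and is not part of uom-systems.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Sketch: derive both fluid ounces from their gallon definitions, in millilitres. */
final class FluidOunceComparison {
    // 1 cubic inch = 16.387064 ml exactly, since 1 inch = 2.54 cm.
    static final BigDecimal ML_PER_CUBIC_INCH = new BigDecimal("16.387064");

    // US gallon = 231 cubic inches; Imperial gallon = 4.54609 litres.
    static final BigDecimal US_GALLON_ML = ML_PER_CUBIC_INCH.multiply(BigDecimal.valueOf(231));
    static final BigDecimal IMPERIAL_GALLON_ML = new BigDecimal("4546.09");

    // 128 US fluid ounces per US gallon, 160 Imperial fluid ounces per Imperial gallon.
    // Both quotients terminate, so the exact divide(BigDecimal) overload is safe here.
    static final BigDecimal US_FL_OZ_ML = US_GALLON_ML.divide(BigDecimal.valueOf(128));
    static final BigDecimal IMPERIAL_FL_OZ_ML = IMPERIAL_GALLON_ML.divide(BigDecimal.valueOf(160));

    /** Imperial-to-US ratio, rounded to four decimal places. */
    static BigDecimal ratio(BigDecimal imperial, BigDecimal us) {
        return imperial.divide(us, 4, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println("US fl oz (ml):        " + US_FL_OZ_ML);
        System.out.println("Imperial fl oz (ml):  " + IMPERIAL_FL_OZ_ML);
        System.out.println("gallon ratio (Imp/US): " + ratio(IMPERIAL_GALLON_ML, US_GALLON_ML));
        System.out.println("fl oz ratio (Imp/US):  " + ratio(IMPERIAL_FL_OZ_ML, US_FL_OZ_ML));
    }
}
```

The gallon ratio comes out above 1 and the fluid ounce ratio below 1, which is the reversal described above: 160/128 = 1.25 is larger than the roughly 1.2 gap between the gallons.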

Now, in the CLDR class, we see these units defined:
public static final Unit<Volume> GALLON = addUnit(CUBIC_INCH.multiply(231));
public static final Unit<Volume> GALLON_IMPERIAL = addUnit(LITER.multiply(454609).divide(100000));
public static final Unit<Volume> FLUID_OUNCE = addUnit(GALLON.divide(128));
public static final Unit<Volume> CUP = addUnit(FLUID_OUNCE.multiply(8));
public static final Unit<Volume> PINT = addUnit(FLUID_OUNCE.multiply(20), "Pint", "pt", true);
public static final Unit<Volume> QUART = addUnit(FLUID_OUNCE.multiply(40), "Quart", "qt");
private static final Unit<Volume> MINIM = MICRO(LITER).multiply(61.61152d);

The issue here is that the PINT and QUART quantities are using the US fluid ounce as their base reference unit, but are using the multipliers for their Imperial quantities (20 and 40 respectively), rather than the US-specific multipliers (16 and 32).
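To make the mismatch concrete, here is a minimal plain-Java sketch that redoes the arithmetic without the units library. It mirrors the factors from the definitions quoted above (US gallon / 128, then times 20 or 40) and compares the results with the US and Imperial values. The class and constant names are hypothetical, chosen only for this illustration.

```java
/** Sketch: the PINT/QUART factors quoted above, compared with both systems (in ml). */
final class PintMismatch {
    static final double US_GALLON_ML = 231 * 16.387064;       // 231 cubic inches
    static final double IMPERIAL_GALLON_ML = 4546.09;         // 4.54609 litres

    static final double US_FL_OZ_ML = US_GALLON_ML / 128;
    static final double IMPERIAL_FL_OZ_ML = IMPERIAL_GALLON_ML / 160;

    // As quoted from the CLDR class: US fluid ounce with Imperial multipliers.
    static final double PINT_AS_DEFINED_ML = US_FL_OZ_ML * 20;
    static final double QUART_AS_DEFINED_ML = US_FL_OZ_ML * 40;

    // What each system actually specifies.
    static final double US_PINT_ML = US_FL_OZ_ML * 16;
    static final double US_QUART_ML = US_FL_OZ_ML * 32;
    static final double IMPERIAL_PINT_ML = IMPERIAL_FL_OZ_ML * 20;
    static final double IMPERIAL_QUART_ML = IMPERIAL_FL_OZ_ML * 40;

    public static void main(String[] args) {
        System.out.printf("PINT as defined:  %.3f ml%n", PINT_AS_DEFINED_ML);
        System.out.printf("US pint:          %.3f ml%n", US_PINT_ML);
        System.out.printf("Imperial pint:    %.3f ml%n", IMPERIAL_PINT_ML);
        System.out.printf("QUART as defined: %.3f ml%n", QUART_AS_DEFINED_ML);
        System.out.printf("US quart:         %.3f ml%n", US_QUART_ML);
        System.out.printf("Imperial quart:   %.3f ml%n", IMPERIAL_QUART_ML);
    }
}
```

The mixed definition lands at roughly 591 ml for the pint, above both the US pint (about 473 ml) and the Imperial pint (about 568 ml), so it corresponds to neither system.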

There is also the question of whether to prefer one system over the other when choosing which should be assigned the canonical name for the unit (i.e. "GALLON" vs. "GALLON_US"). I submit that having a full set of units for each system would be appropriate and less confusing (i.e. GALLON_US, QUART_US ... MINIM_US, as well as GALLON_IMPERIAL, QUART_IMPERIAL ... MINIM_IMPERIAL), though unfortunately more verbose.

Finally, it strikes me that using two reference quantities in a single system (see GALLON and MINIM definitions above) runs the risk of discontinuities in calculations when approaching from either end. Wouldn't it be preferable to pick one reference quantity for a given system and derive all other units within that system from that one?
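A rough sketch of that concern, again in plain Java rather than with the library's types: it derives the minim from the gallon (gallon / 128 / 480, per the ratios listed earlier) and compares it with the separately hard-coded 61.61152 microlitres from the MINIM definition quoted above. The class name `SingleReferenceSketch` and its constants are assumptions made for this example.

```java
import java.math.BigDecimal;

/** Sketch: one reference quantity (the gallon) versus a second hard-coded one (the minim). */
final class SingleReferenceSketch {
    // US gallon in microlitres: 231 cubic inches at 16387.064 microlitres each.
    static final BigDecimal US_GALLON_UL =
            new BigDecimal("16387.064").multiply(BigDecimal.valueOf(231));

    // Everything below is derived from the gallon alone. Both quotients terminate.
    static final BigDecimal FL_OZ_UL = US_GALLON_UL.divide(BigDecimal.valueOf(128));
    static final BigDecimal MINIM_DERIVED_UL = FL_OZ_UL.divide(BigDecimal.valueOf(480));

    // The second, independent reference value quoted in the issue.
    static final BigDecimal MINIM_HARDCODED_UL = new BigDecimal("61.61152");

    /** Fluid ounce rebuilt from the hard-coded minim (480 minims per fluid ounce). */
    static BigDecimal fluidOunceFromHardcodedMinim() {
        return MINIM_HARDCODED_UL.multiply(BigDecimal.valueOf(480));
    }

    public static void main(String[] args) {
        System.out.println("minim derived from gallon (ul): " + MINIM_DERIVED_UL);
        System.out.println("minim hard-coded (ul):          " + MINIM_HARDCODED_UL);
        System.out.println("fl oz from gallon (ul):         " + FL_OZ_UL);
        System.out.println("fl oz from 480 minims (ul):     " + fluidOunceFromHardcodedMinim());
        System.out.println("difference (ul):                "
                + fluidOunceFromHardcodedMinim().subtract(FL_OZ_UL));
    }
}
```

The gap is tiny because the hard-coded minim is a rounded value, but it is not zero, so converting down from the gallon and up from the minim do not meet at exactly the same fluid ounce. Deriving every unit from a single reference avoids that by construction.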

keilw commented

Thanks for mentioning that. Would you have a PR for the factors if something is currently wrong or mismatching?

About the names: the CLDR system in particular is strictly based on ICU4J MeasureUnit, where US Customary units have no prefix or suffix in most cases, while others like "IMPERIAL" or "SCANDINAVIAN" do. We are consistent with the Unicode standard here.

In Imperial or USCustomary there are a few mostly internal constants that are currently not publicly visible, like FLUID_OUNCE_UK, while FLUID_OUNCE is visible. I think we could clarify some of the members in the Imperial system, but I am not convinced that we should change all the members in USCustomary, at least not without a careful survey. Plus there will always be different standards and conventions we follow, like UCUM (where almost every unit is included, hence most of them have prefixes or namespaces) or Unicode.

lbeaulac commented

I'm completely new to GitHub, so I have no notion of how to submit a PR or suchlike. We don't use GitHub where I work.

The chief item that I'm trying to draw attention to is the incorrect multipliers being used for PINT and QUART, given that they are intended to be USCustomary units.

I understand the reluctance to deviate from the ICU4J nomenclature, but in cases where there can be ambiguity, like the parallel systems of liquid volume units, it then becomes incumbent on the maintainers of this package to clearly state in the Javadoc for each ambiguously-named canonical unit (PINT, QUART, TEASPOON, etc) which system the unit belongs to.

keilw commented

Can you point to the correct ones or are the ones in USCustomary correct in your opinion?
The Unicode CLDR module should remain independent, so neither Imperial nor USCustomary shall be used as a dependency. You are correct with QUART, given ICU 68 introduced a QUART_IMPERIAL, but for PINT IMO that is a purely British term, hence unless ICU4J ever separates between PINT and PINT_IMPERIAL, the Pint is considered British and based on the Imperial multipliers.
HTH,
Werner

lbeaulac commented

The US Customary package has a definition for PINT which is correctly set to equal 4 GILL_LIQUID (and so equal to 16 FLUID_OUNCE, as I stated initially), so at least one source maintains that a PINT isn't a British-only quantity, as you seem to believe.

And despite what ICU4J decides or doesn't decide about the PINT, I submit that it's at least as important to be internally consistent as it is to "guess" at the intentions of an undecided external authority. A PINT that is equal to 20 US Ounces is just plain wrong: it doesn't match any official quantity.

One can argue that the unit naming convention ICU4J is following is to reserve the short-form name of a unit to always mean the US unit (where there is ambiguity), and employ a suffix to denote any alternate units of the same base name. Regrettably US-centric, IMO. But following that convention, one must conclude that a PINT is meant to be a US PINT (16 US Fluid Ounces), and that a QUART is similarly meant to be a US QUART (32 US Fluid Ounces).

Curiously, the US Customary package does not have a definition for QUART. This is odd, as things like milk are commonly sold in quarts here in the US.

keilw commented

As long as ICU4J doesn't also define a PINT_IMPERIAL, we can only make an assumption about which one it means. They don't explain that, and while a different METRIC_PINT is already defined (but that's 8 METRIC_CUP each 250ml), the plain PINT is completely ambiguous. ICU4J does not care about these factors. Plus the US even has a DRY_PINT, which ICU4J also does not care about at this point, so we make an assumption until it's changed in the ICU definition.

If you need to use both then you can always use USCustomary.
The current definition matches https://en.wikipedia.org/wiki/Pint and there the Imperial Pint is the first/primary entry with at least 2 US pints being second and several other variations of the Pint in different countries, most of them historic.

1 Pint = 20 imperial fluid ounces

In the United Kingdom, the imperial pint is the mandatory base unit for draught beer and cider. Milk sold in returnable containers (such as glass bottles) may be sold by the pint alone and other goods may be sold by the pint if the equivalent metric measure is also given.
So it's a more official unit there than elsewhere, but given QUART_IMPERIAL was also just introduced as DRAFT in the latest ICU4J, I decided to slightly deviate here and offer a PINT_IMPERIAL in addition to the PINT. Hopefully ICU4J will also add it under that name.

We keep ignoring their multiples, although ICU4J recently came up with a "Complexity" (see unitsofmeasurement/indriya#323) and also introduced a SIPrefix enum as a draft. It has all those multiples like CENTILITER or DECILITER, which we don't model this way since pretty much every unit can be combined with a prefix as long as the result makes sense. ICU4J, in its Complexity, states that "you cannot set the power or SI prefix of a compound unit." That covers pretty much all derived unit types, so that kind of restriction seems inappropriate.
ICU is more about spelling and composing words: if you take "meter-per-second" and combine it with "milli", that could result in funny concatenations where it's hard to distinguish between "milli(meter-per-second)" and "millimeter-per-second", but for arithmetic it should be possible.

Closing this as I think the only two factors were QUART and PINT and both got a UK equivalent now, even though the PINT goes beyond the current ICU/CLDR definition.

lbeaulac commented

Thank you for following this through. Chasing down all the myriad variations of liquid measures around the world would be a trip down the proverbial rabbit hole, but you've handily addressed the two obvious errors that I pointed out.
Cheers.
PS. If you ever order a pint of beer in Canada, make sure you get the mandated Imperial pint, and not the smaller US pint.