Unit reductions
For a given unit, for example "5,280 feet" or "1,000 metres", it is often appropriate and expected to reduce the unit to a more commonly used form such as "1 mile" or "1 kilometre". This note explores an approach to implementing unit reductions using Cldr data.
Measurement systems
In our example it would be expected that "feet" would be reduced using the "imperial" or "US" systems whereas "metres" would be reduced using the "metric" system. The unit data for Cldr does not maintain a per-unit mapping of unit name to measurement system.
Some units, such as digital units, have only a single system.
Locale and measurement system
Cldr does provide a mapping of locale to measurement system, so we can identify the preferred measurement system for a given locale. This would allow a unit reduction to also convert to the appropriate measurement system.
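As a rough illustration of the lookup described above, here is a minimal sketch. The module name, the function name and the tiny territory table are assumptions made for this example and are not part of Cldr's API. The table is deliberately incomplete and treats every territory it does not list as metric.

```elixir
defmodule SystemSketch do
  # Hypothetical, deliberately incomplete table: territory => measurement system.
  # Anything not listed falls back to :metric.
  @by_territory %{"US" => :US, "GB" => :UK}

  # Takes a locale string such as "en-US" and returns :US, :UK or :metric.
  def for_locale(locale) do
    territory = locale |> String.split(["-", "_"]) |> List.last()
    Map.get(@by_territory, territory, :metric)
  end
end
```

With this sketch, SystemSketch.for_locale("en-US") returns :US and SystemSketch.for_locale("de-DE") returns :metric.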
Automatic reduction
The intent of reduction is to produce a result in the range -10 < unit < +10. To do so, identify the target unit and the reduction factor required, then convert to the target unit. To identify the target unit we take into account the source unit's measurement system and attempt to find a reduction target in the same measurement system.
Example
iex> Cldr.Unit.reduce Cldr.Unit.new(:meter, 1100)
#Unit(:kilometer, 1.1)
# System conversion tries to keep a similar magnitude as the
# source unit. Convert to :US, :UK or :metric
iex> Cldr.Unit.convert_system Cldr.Unit.new(:meter, 1000), to: :US
#Unit(:yard, 1093.61)
# Round a unit. Rounding options are passed
# through to Cldr.Number
iex> Cldr.Unit.round Cldr.Unit.new(:yard, 1093.61)
#Unit(:yard, 1093.6)
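The reduction step could be sketched roughly as follows. This is not the library's implementation. The module, the {unit, value} tuple representation and the factor table are all hypothetical, and the selection rule (take the largest unit in the same system whose magnitude is at least 1) is only one assumed way to approximate the target range. Factors are written as integer multiples of 0.1 mm so that integer inputs are compared exactly.

```elixir
defmodule ReduceSketch do
  # Hypothetical table: size of each unit in tenths of a millimetre,
  # grouped by measurement system and ordered smallest to largest.
  @length_units %{
    metric: [millimeter: 10, centimeter: 100, meter: 10_000, kilometer: 10_000_000],
    us: [inch: 254, foot: 3_048, yard: 9_144, mile: 16_093_440]
  }

  # Takes {unit, value} and returns {unit, value} in the same system.
  def reduce({unit, value}) do
    # Find the system that contains the source unit.
    {_system, units} =
      Enum.find(@length_units, fn {_system, units} -> Keyword.has_key?(units, unit) end)

    base = value * Keyword.fetch!(units, unit)

    # Prefer the largest unit whose converted magnitude is at least 1,
    # falling back to the smallest unit in the system.
    {target, size} =
      units
      |> Enum.reverse()
      |> Enum.find(hd(units), fn {_unit, size} -> abs(base) >= size end)

    {target, base / size}
  end
end
```

Here ReduceSketch.reduce({:meter, 1100}) returns {:kilometer, 1.1} and ReduceSketch.reduce({:foot, 5280}) returns {:mile, 1.0}. Where a system has a large gap between adjacent units, a rule like this cannot always land inside the target range.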
Limitations
- Cldr data is not a comprehensive list of units.
- The Digital category of units only supports base 10 conversions. Adding base 2 conversions would also be useful.
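To make the base 10 versus base 2 distinction concrete, here is a small sketch. The module and function are hypothetical, and the binary unit names (:kibibyte and friends) follow the IEC naming as an assumption; they are not taken from the Cldr unit list.

```elixir
defmodule DigitalSketch do
  # Base 10 factors (bytes per unit).
  @decimal [kilobyte: 1000, megabyte: 1000 * 1000, gigabyte: 1000 * 1000 * 1000]
  # Base 2 factors (bytes per unit), using assumed IEC-style names.
  @binary [kibibyte: 1024, mebibyte: 1024 * 1024, gibibyte: 1024 * 1024 * 1024]

  # Converts a byte count to the given unit; always returns a float.
  def from_bytes(bytes, unit) do
    factor = Keyword.fetch!(@decimal ++ @binary, unit)
    bytes / factor
  end
end
```

The same byte count gives different numbers in the two bases: DigitalSketch.from_bytes(1_048_576, :mebibyte) is 1.0, while DigitalSketch.from_bytes(1_048_576, :megabyte) is 1.048576.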
I really like the idea for converting to a different measurement system. This would certainly be a nice feature for internationalization. A mapping of system -> unit shouldn't be a difficult thing to add I'd imagine.
The automatic reduction, I feel, tries to be too automatic:
E.g. I might want to reduce 300mm, but at least here in Germany it's uncommon to use decimeter so "30 cm" is actually better than "3 dm". A similar case would be with meter -> km. There's deca- and hectometer in between, but besides being unlikely in day-to-day use, they even seem to be absent in this package, which means "300 meters" would be an unreachable result (300m isn't between [-10, 10], while 0.3km is).
Another example might be that a usage context is expecting sizes to be in meters or larger. So the 300mm example from above would better display as "0.3 meters", which is easier to compare to other lengths displayed in meters.
So I feel like a lot of use-cases would actually benefit from predefining the units used for reduction. I'd give the option to the user to either supply a single list of units or a list of units for each measurement system, where cldr would choose the measurement system by the locale. For differentiating between e.g. "300 meters" and "0.3 km" there could be an option like the format ones of cldr or one which specifically prefers/refuses values smaller than 1.
Also I'm not sure how well reduce would work in terms of api naming. For me it sounds way too much like what Enum does. I think the use case is most often to show a "simpler", more quickly readable version of a measurement, so maybe something like simplify or similar.
About the limitations: The first one is a limitation you have to deal with anyway if you're using the package. And having base 2 for file sizes would indeed be a useful addition.
Thanks for the very helpful comments which, as usual, make a lot of sense.
You say:
So I feel like a lot of use-cases would actually benefit from predefining the units used for reduction. I'd give the option to the user to either supply a single list of units or a list of units for each measurement system,
Can you give me an example of what the api might look like? (Agreed, reduce might not be such a good idea. Decimal.reduce uses it, but its context is less ambiguous.)
In release 1.0 there is Cldr.Unit.convert/2 which allows conversion so you can already:
iex> Cldr.Unit.convert Cldr.Unit.new(:millimeter, 300), :centimeter
#Unit<:centimeter, 30.0>
But you suggest a list of alternatives, which I understand conceptually, but I'm not sure how in practice you're suggesting selecting amongst the alternatives.
iex> units = [:centimeter, :meter]
iex> Cldr.Unit.convert_units Cldr.Unit.new(:millimeter, 3), units
#Unit<:centimeter, 0.3>
iex> Cldr.Unit.xyz Cldr.Unit.new(:millimeter, 3000), units
#Unit<:meter, 3>
iex> units = [:kilobyte, :megabyte, :gigabyte]
iex> Cldr.Unit.convert_units Cldr.Unit.new(:byte, 900), units
#Unit<:kilobyte, 0.9>
iex> Cldr.Unit.convert_units Cldr.Unit.new(:byte, 9_000_000_000_000), units
#Unit<:gigabyte, 9_000>
iex> metric = [:centimeter, :meter]
iex> uk = [:foot, :yard]
iex> us = [:foot, :yard]
iex> Cldr.Unit.convert_units_by_system Cldr.Unit.new(:millimeter, 3), metric: metric, uk: uk, us: us
#Unit<:foot, …>
With such a base api cldr units could also add "often used" lists of units. E.g. :byte..terabyte will probably be the most used digital unit range. Also the smallest unit in the list would always be chosen (even for 0.… values), while later units would be value < 1, but I'm really not sure what would be the best way to threshold the switching to the next bigger unit. That would probably need some more real use-cases to really be determined.
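One possible selection rule for a user-supplied unit list can be sketched like this. It is only an assumption about how the thresholding might work, not a settled design. The module, the function and its threshold option are hypothetical, and the caller passes in the unit sizes rather than the sketch looking them up from Cldr.

```elixir
defmodule PickSketch do
  # candidates: keyword list ordered smallest to largest, unit => size in base units.
  # threshold: minimum magnitude (in the candidate unit) needed to switch to it.
  def pick(base_value, [{_, _} | _] = candidates, threshold \\ 1) do
    # Take the largest candidate that meets the threshold,
    # otherwise fall back to the smallest unit in the list.
    {unit, size} =
      candidates
      |> Enum.reverse()
      |> Enum.find(hd(candidates), fn {_unit, size} -> abs(base_value) >= threshold * size end)

    {unit, base_value / size}
  end
end
```

With millimeters as the base, PickSketch.pick(3, centimeter: 10, meter: 1000) gives {:centimeter, 0.3} and PickSketch.pick(3000, centimeter: 10, meter: 1000) gives {:meter, 3.0}, which matches the examples above. Raising the threshold delays the switch to the next bigger unit, so it is the natural knob to tune against real use-cases.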
I have pushed a new release and published version 2.2.0 which I believe addresses this issue. I'd welcome your feedback. It's also clear that the strategy of embedding conversion factors in the source isn't a good idea. In the next release I will convert the factors to JSON which can be downloaded so that updates to the conversion factor tables do not require a new release.
Enhancements
This release is primarily about improving the conversion of units without introducing precision errors that accumulate for floats. The strategy is to define the conversion value between individual unit pairs.
Currently the implementation uses a static map. In order to give users a better experience, a future release will allow specifying mappings both as a parameter to Cldr.Unit.convert/2 and as compile time configuration options, including the option to download conversion tables from the internet.
- Direct conversions are now supported. For some calculations, the process of dividing and multiplying by conversion factors produces an unexpected result. Some direct conversions are now defined which produce a more expected result.
- In most cases, return integer values from conversion and decomposition when the originating unit value is also an integer.
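The two points above can be illustrated with a small sketch of direct pair conversion. It is an assumption about the general idea rather than the release's actual code. The module, the function and the factor table are hypothetical, with each factor stored as an exact {numerator, denominator} pair so that integer inputs stay integers whenever the result is whole.

```elixir
defmodule DirectSketch do
  # Hypothetical direct factors between unit pairs, stored as exact
  # {numerator, denominator} pairs instead of floats.
  @direct %{
    {:foot, :inch} => {12, 1},
    {:inch, :foot} => {1, 12},
    {:yard, :foot} => {3, 1}
  }

  # Converts an integer value using the direct factor for {from, to}.
  # Returns an integer when the result is whole, otherwise a float.
  def convert(value, from, to) when is_integer(value) do
    {num, den} = Map.fetch!(@direct, {from, to})
    scaled = value * num
    if rem(scaled, den) == 0, do: div(scaled, den), else: scaled / den
  end
end
```

For example, DirectSketch.convert(3, :foot, :inch) returns the integer 36 and DirectSketch.convert(6, :inch, :foot) returns 0.5. Going through a float base unit instead can introduce small rounding differences, which is the kind of surprise the direct factors avoid.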