unitsofmeasurement/uom-systems

Units in `USCustomary` missing labels and/or name

GregJohnStewart opened this issue · 13 comments

There are a number of static units that are missing name and symbol values. I can probably go in and add them myself and make a PR, but wanted the consensus/go-ahead from you first.

Examples (from the units I currently directly use):

  • USCustomary.INCH
  • ...

There are way more, seems like most of the ones I use are missing at least their symbol or name.

My use case would greatly benefit from them all being present... toString() has proved not to be sufficient, and I'd rather not have to re-declare a bunch of static members to get this functionality.

To be clear, I am building a system that allows a user to choose their own units, and thus needs proper display of the name and symbol of the unit. toString() gets the point across, but is not enough.

For now, as a workaround I am currently doing the following. Hacky but it is what it is:

	// Fills in a missing name and/or symbol on an existing unit via reflection.
	// Note: this mutates the shared unit instance in place.
	private static <Q extends Quantity<Q>> Unit<Q> getUnitWithNameSymbol(
		Unit<Q> unit,
		String nameIfNone,
		String symbolIfNone
	) throws NoSuchFieldException, IllegalAccessException {
		@SuppressWarnings("unchecked")
		Class<? extends Unit<Q>> clazz = (Class<? extends Unit<Q>>) unit.getClass();
		
		// Assumes "name" and "symbol" are declared on the direct superclass of the unit's class.
		if (unit.getName() == null || unit.getName().isBlank()) {
			Field nameField = clazz.getSuperclass().getDeclaredField("name");
			nameField.setAccessible(true);
			nameField.set(unit, nameIfNone);
			nameField.setAccessible(false);
		}
		if (unit.getSymbol() == null) {
			Field symbolField = clazz.getSuperclass().getDeclaredField("symbol");
			symbolField.setAccessible(true);
			symbolField.set(unit, symbolIfNone);
			symbolField.setAccessible(false);
		}
		
		return unit;
	}

	//...

	// usage
	getUnitWithNameSymbol(Units.GRAM, "Gram", "g"),

In my implementation, I keep these in a list of 'relevant' units that my app supports. You can find the full implementation here: https://github.com/Epic-Breakfast-Productions/OpenQuarterMaster/blob/main/software/libs/open-qm-core/src/main/java/com/ebp/openQuarterMaster/lib/core/UnitUtils.java
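
For comparison, here is a minimal sketch of one way to attach a display name and symbol without mutating the unit through reflection: keep the metadata in a side table keyed by the unit instance. `DisplayInfo` and `UnitDisplayRegistry` are hypothetical names, not types from the library, and the unit is treated as an opaque `Object` so the sketch compiles without the javax.measure dependency.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical holder for display metadata; not a type from the library. */
final class DisplayInfo {
    final String name;
    final String symbol;

    DisplayInfo(String name, String symbol) {
        this.name = name;
        this.symbol = symbol;
    }
}

/** Hypothetical side table that leaves the unit objects themselves untouched. */
final class UnitDisplayRegistry {
    // Identity-based lookup: assumes the units are shared static constants.
    private final Map<Object, DisplayInfo> infoByUnit = new IdentityHashMap<>();

    UnitDisplayRegistry register(Object unit, String name, String symbol) {
        infoByUnit.put(unit, new DisplayInfo(name, symbol));
        return this;
    }

    /** Registered name, or the unit's toString() when nothing was registered. */
    String nameOf(Object unit) {
        DisplayInfo info = infoByUnit.get(unit);
        return info != null ? info.name : String.valueOf(unit);
    }

    /** Registered symbol, or the unit's toString() when nothing was registered. */
    String symbolOf(Object unit) {
        DisplayInfo info = infoByUnit.get(unit);
        return info != null ? info.symbol : String.valueOf(unit);
    }
}
```

With the real library on the classpath, the call would look something like `registry.register(USCustomary.INCH, "Inch", "in")`. The trade-off is that `getName()` and `getSymbol()` on the unit itself still return whatever the library set.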

keilw commented

I can't see why all the reflection would be necessary? Maybe @desruisseaux gets some ideas from that or how it could be used?

Also, please point out the units in USCustomary that do not already have both symbol and name set via private static <U extends Unit<?>> U addUnit(U unit, String name, String label).

INCH is probably a rare exception where only the symbol is added.

I did not verify this in the source code, but the call to setAccessible(true) suggests that the purpose of the reflection is to access a private field.

keilw commented

Neither "name" nor "symbol" is private, at least not where extensions to AbstractSystemOfUnits call the addUnit() methods.
If, say, the name of INCH is missing, we can add that based on an issue (including this one), but I fail to understand how the code snippet makes sense, because it more or less reinvents everything USCustomary already declares.

@keilw Reflection is necessary as a direct user of the API, unless there is an easier way. I am unaware of any way to set name or symbol after the object is created. If there is an easy way to copy a unit object while specifying name/symbol, I'd like to know.

I'm fairly certain name and symbol are private fields; their getters are not private (with no setters). This is probably as it should be, as it makes sense for Unit to be an immutable object.

At any rate, all that reflection is my personal workaround until all the original static units in the library get updated with proper names/symbols. That is the initial intent of this issue: to get names/symbols for everything. Most static units in USCustomary are missing a name, a symbol, or both.

keilw commented

@GregJohnStewart Yes, Unit is supposed to be immutable, and only a small fraction of units have a symbol, because transformed units have none. That's where the label comes into play. Putting a symbol on them goes against parts of the framework, especially formatting, and you may get strange results when formatting a TransformedUnit that was given a symbol via reflection.
The only part of that UnitUtils class that seems unproblematic is the desire to collect ALL known units for a particular quantity type, like Mass, with a helper string (somewhat similar to e.g. ICU4J). Until "Valhalla" is fully done and offers generic type information at runtime (which, as I just heard at JavaLand, may well take until ~2030 ;-/), it is not possible to know at runtime that e.g. INCH is of type Length. I would probably store it via the Quantity class rather than the name of that class, but the idea seems legitimate. Manipulating the value of getSymbol() in units that by definition have no symbol, on the other hand, puts you on a slippery slope. That is why the "label" was invented in UnitFormat, and for some use cases the label could even be locale-sensitive.
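
The suggestion to key the collection by the Quantity class rather than by its name could be sketched roughly as follows. All types here (`SketchQuantity`, `SketchLength`, `SketchMass`, `SketchUnit`, `UnitCatalog`) are hypothetical stand-ins so the example compiles on its own; in real code the keys would be the javax.measure quantity interfaces and the values `Unit<Q>` instances.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-ins for the quantity and unit types (assumptions, not the library's types).
interface SketchQuantity<Q extends SketchQuantity<Q>> {}
interface SketchLength extends SketchQuantity<SketchLength> {}
interface SketchMass extends SketchQuantity<SketchMass> {}

final class SketchUnit<Q extends SketchQuantity<Q>> {
    private final String label;

    SketchUnit(String label) {
        this.label = label;
    }

    @Override
    public String toString() {
        return label;
    }
}

/** Groups units under the Class token of their quantity type. */
final class UnitCatalog {
    private final Map<Class<?>, List<SketchUnit<?>>> byQuantity = new LinkedHashMap<>();

    // The Class token carries the type information that generics erase at runtime.
    <Q extends SketchQuantity<Q>> void add(Class<Q> quantityType, SketchUnit<Q> unit) {
        byQuantity.computeIfAbsent(quantityType, k -> new ArrayList<>()).add(unit);
    }

    @SuppressWarnings("unchecked") // add() only ever stores SketchUnit<Q> under Class<Q>
    <Q extends SketchQuantity<Q>> List<SketchUnit<Q>> unitsFor(Class<Q> quantityType) {
        List<SketchUnit<Q>> result = new ArrayList<>();
        for (SketchUnit<?> unit : byQuantity.getOrDefault(quantityType, List.of())) {
            result.add((SketchUnit<Q>) unit);
        }
        return result;
    }
}
```

Using the class itself as the key means a renamed or relocated quantity type is caught by the compiler, which a string key would not give you.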

I would normally accept that transformed units have no symbol, etc, but in this case we are specifically describing standardized units. A user of the library expects these to be fully described. If the units aren't fully described, then that makes the library much less helpful and usable.

Also, through my testing it looks like the transformed units do actually keep their dimensions:
Testing unit: in, name="Inch", symbol="in", dimension="[L]"

		log.info(
			"Testing unit: {}, name=\"{}\", symbol=\"{}\", dimension=\"{}\"",
			unit,
			unit.getName(),
			unit.getSymbol(),
			unit.getDimension()
		);

All the units I have looked at seem to have a dimension.

keilw commented

That depends on how they are transformed or multiplied. If INCH was derived by multiplying or dividing METRE, then of course the dimension stays the same; otherwise it would be something else like "[L]/...". Except for those that are explicitly dimensionless, all units are supposed to have a dimension.

The units are fully described, and the "label" in this case is a more flexible visual representation, while only certain units like base units have a symbol. The standard is built around such principles, so changing that would be at everyone's risk: several assumptions, such as formatting, are based on it. The symbol therefore has a narrower purpose, while every unit must produce a label, even if it may be a blank string in very special cases.

It seems super odd to me that there's a case where one wouldn't want to assign a human-readable name and symbol. Even if one derives one unit from another, it is typically to get to another standardized unit. One doesn't normally make arbitrary nameless/symbolless units, IMO (American news units aside ("The meteor was half a Michael Jackson in diameter")).

Even so, I would still argue that since these are standard units described from the library, we should find a way to get them all names/symbols. If that means tweaking the spec, then so be it.

Until then, use cases such as mine will require reflection as I have described above. For the record, I don't see any adverse effects from formatting (though perhaps I just don't hit the area that issues would show themselves) after injecting the name/symbols.

keilw commented

But the label is the human-readable representation; you'll get something for every unit by calling toString(). And while that is by default also the locale-agnostic version, it is up to the UnitFormat implementation whether it is automatically translated to either "kpH", "km/u" (Dutch) or "km/t" (Danish).

See "Transformed units have no symbol. But like any other units, they may have labels attached to them" and in practice they always do because the default UnitFormat implementations provided with the RI won't accept null as a label.
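
To make the locale-sensitive label idea concrete, here is a small sketch of a label that falls back to a locale-agnostic default. `LocalizedLabel` is a hypothetical class, not part of UnitFormat or the RI, and the default "km/h" is an assumption chosen for the example; the Dutch and Danish labels are the ones mentioned above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical label holder: one default label plus per-language overrides. */
final class LocalizedLabel {
    private final String defaultLabel;
    // Keyed by ISO language code, so "nl-NL" and "nl-BE" both resolve to "nl".
    private final Map<String, String> byLanguage = new HashMap<>();

    LocalizedLabel(String defaultLabel) {
        this.defaultLabel = defaultLabel;
    }

    LocalizedLabel with(Locale locale, String label) {
        byLanguage.put(locale.getLanguage(), label);
        return this;
    }

    /** Returns the override for the locale's language, else the default label. */
    String labelFor(Locale locale) {
        return byLanguage.getOrDefault(locale.getLanguage(), defaultLabel);
    }
}
```

A real UnitFormat implementation would resolve this internally; the sketch only illustrates why a label, unlike a fixed symbol, can vary by locale.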

So the unit formatting makes sense, but it raises the question of why name/symbol are included in the first place.

Also, I'm having a little trouble nailing down where a formatter exists to get the name/symbol strings from a given Unit. Do you happen to have a pointer on how to get this functionality?

keilw commented

The symbol has a different purpose for each unit. A TransformedUnit gets its label from the chain of all converters involved (the individual symbols plus operators); that's why it has no "symbol" of its own but a "collection" of symbols, if you want.
A BaseUnit or AnnotatedUnit, by contrast, has a symbol.
I am just adding two JUnit tests that check all of USCustomary to ensure the toString() representation is always present, and the same for getName(). There are a few cases like RANKINE or FAHRENHEIT where the name is not set. I will fix that so the tests pass.
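
A check of that kind could look roughly like the sketch below, which walks the public static fields of a system class and reports those whose unit has no name. This is not the actual test added to the project: `NamedUnit`, `SimpleNamedUnit`, `SketchSystem`, and `UnitNameAudit` are hypothetical stand-ins so the example runs without the library.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Stand-in for the part of the Unit interface this check needs (assumption).
interface NamedUnit {
    String getName();
}

final class SimpleNamedUnit implements NamedUnit {
    private final String name;
    private final String label;

    SimpleNamedUnit(String name, String label) {
        this.name = name;
        this.label = label;
    }

    @Override public String getName() { return name; }
    @Override public String toString() { return label; }
}

// Stand-in for a system of units with one unit deliberately left unnamed.
final class SketchSystem {
    public static final NamedUnit INCH = new SimpleNamedUnit("Inch", "in");
    public static final NamedUnit FAHRENHEIT = new SimpleNamedUnit(null, "\u00b0F");
}

final class UnitNameAudit {
    /** Returns the names of public static unit fields whose getName() is null or blank. */
    static List<String> fieldsMissingName(Class<?> systemClass) {
        List<String> missing = new ArrayList<>();
        for (Field field : systemClass.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (!Modifier.isPublic(mods) || !Modifier.isStatic(mods)) continue;
            if (!NamedUnit.class.isAssignableFrom(field.getType())) continue;
            try {
                NamedUnit unit = (NamedUnit) field.get(null); // static field, no instance needed
                if (unit == null || unit.getName() == null || unit.getName().isBlank()) {
                    missing.add(field.getName());
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read " + field.getName(), e);
            }
        }
        return missing;
    }
}
```

In a JUnit test, the assertion would simply be that the returned list is empty for the system class under test.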

Regarding the formatting, I guess there is room for improvement, but please suggest improvements in the RI, Indriya, rather than here. I noticed a somewhat related ticket about missing names: unitsofmeasurement/indriya#367. It is not really about formatting, but one approach could be adding a pattern similar to SimpleQuantityFormat or SimpleDateFormat (which was also the inspiration for the name).

keilw commented

Now all units in USCustomary have names. For additional options of SimpleUnitFormat, etc., please raise another ticket for the RI.