open-i18n/rust-unic

[ucd/bidi] BidiClass should use the long names

Closed this issue · 3 comments

CAD97 commented

The Bidi_Class_Values enum should use the long names of the bidi classes, for clarity and to fit in better with the rest of the ucd api and the Rust ecosystem.

This can probably be bikeshedded ad nauseum, but defaulting to the descriptive names seems the better idea and §5.8.1 Property Aliases tells us that the long symbolic names are the preferred aliases. (Cases like Age where we can provide a more meaningful struct rather than a enum excepted, of course.)

We could offer a alias mod or such which provides pub use bindings to the abbreviated symbolic name. PropertyValueAliases.txt could (in theory) be used to generate and/or test the aliases.

Good point and great idea! Having a mod that only aliases the names sounds like the best alternate to allow access to the same enum variant under different names, while supporting use ...::* for both formats.

For the mod name, I like the term abbr_names better, because it makes it clear what kind of alias the names are, and leave it open to have other sets of aliases, if needed in any case.

Then, it's a matter of the mod path. Do we want to have unic::ucd::BidiClass::abbr_names, or unic::ucd::bidi_class_abbr_names, or unic::ucd::bidi_class::abbr_names (we don't have unic::ucd::bidi_class at the moment)?

I don't know if there are any similar cases in the current Rust code to see what's the common practice. So, we can try whatever that's more intuitive and practical to use, and clear to implement. I think the first path is the best in these perspectives.

What do you think?

CAD97 commented

If the first is valid, it makes the most sense. Even if it's not actually that way, it feels like the abbr_names namespace is a child of BidiClass.

I just tested that, and I couldn't get it to work.

Because it's exposed currently as unic::ucd::BidiClass, maybe have the "pseudo enum" as unic::ucd::BidiClassAbbr:

pub enum Enum {
    LongName1,
    LongName2,
}
pub mod EnumAbbr {
    pub use super::Enum::LongName1 as L1;
    pub use super::Enum::LongName2 as L2;
}

The only place where this would break down is if you try to use EnumAbbr as a type parameter; you have to use the actual type. But as a namespace, this is transparently pretending to be the actual enum with different names.

Generally, the reason I usually prefer to keep things as children of a main type (struct/trait) is to make them easier to use having the main type imported (used).

Looks like having mod under impl is not something supported by the language, so unic::ucd::BidiClass::abbr_names out of question.

About using a sudo-type mod, I would prefer to avoid it because it has issues in two fronts: not following well-known practices in naming and requiring lint overrides, and, similarly, possibility of becoming confusing for humans, as you mentioned already.

IMO, we want the basics of this aliasing to be crystal clear for the users. We don't want anyone to think of the aliasing mod as a type, but just a namespace to import some shorthands easily—instead of creating the aliases themselves.

That said, I think we can change mod bidi_class to pub mod bidi_class in components/ucd/bidi/src/lib.rs, and have pub mod abbr_names {} inside it. This allows bidi_class.rs to be self-contained when using the abbr-names in the table files.

One thing to remember here is that every component can have more than one property type which may require aliasing, so, we need the hierarchy.

PS. Another option, for the future, would be using associated_consts feature, which just got stabilized and in about 10 weeks will be available in rustc.