Bikeshed: flag + corresponding getter
mathiasbynens opened this issue · 16 comments
Extracting the discussion from #2 (comment), if we want to gate the new syntax/semantics behind a new flag, there are two questions:
- What would the new flag be (which letter)? Some options are v (u as written in classical Latin) or w (double-u).
- What would the name of the corresponding getter on RegExp.prototype be? uniSet?
Let the bikeshedding commence.
Slight preference for v over w. “v is the next u...”
(Only in English is w “double u”. French w = "double v", German w = "veh". I might reserve w for whatever comes after v...)
Idea for the getter: extCharClass (for “extended”)
Or maybe uniCharClass.
+1 on v
Should expressions be /.../v (v implies/replaces u) or /.../uv (u enables code points, and v enables sets of strings)?
> Should expressions be /.../v (v implies/replaces u) or /.../uv (u enables code points, and v enables sets of strings)?
It should be /.../v (v implies/replaces u) as in the current version of the proposal (linked from issue #12).
- No need to define the new features in non-Unicode mode
- No need to require two flags when the new one alone makes no sense, better to imply/subsume
> - No need to define the new features in non-Unicode mode
We don't need to do that necessarily; /v can just be invalid without /u, a restriction we could lift later.
> - No need to require two flags when the new one alone makes no sense, better to imply/subsume
Using two flags makes it more explicit that this is BOTH a Unicode regex AND one with nested string sets.
It's more explicit, but it is superfluous. I think /u alone will fall by the wayside, and people will just find requiring both an annoyance. "Oh, you forgot to use /uv; you just used /v and that doesn't work by itself."
I also have a slight preference for just v. It would imply u while also switching over to the new [] syntax.
We can leave w for any future grapheme modes if we choose to do those.
Any opinions on the corresponding getter name? Here’s an overview of the current ECMAScript RegExp flags & getters:
assert(/…/d.hasIndices);
assert(/…/g.global);
assert(/…/i.ignoreCase);
assert(/…/m.multiline);
assert(/…/s.dotAll);
assert(/…/u.unicode);
assert(/…/y.sticky);
What would we do for v? Perhaps:
assert(/…/v.uniSet);
// or…
assert(/…/v.unicodeSet);
Now that I’ve written this down, I like unicodeSet, as it enables both unicode (u) + sets.
uniset sounds good, or maybe even just sets
One reason I kind of like two separate flags is because the new key can be named "stringSets" or similar, without needing to reference Unicode. But if we had one combined flag, we might just need to say "unicodeWithStringClasses".
I really think of this as ES Unicode Regex v2; it encompasses and extends what was there before with the /u flag.
Could we please discuss the name of the getter once more?
On its own, I agree "unicodeSet" makes sense. However, it's the same as the closely related ICU class UnicodeSet which has been around for 25-some years and supports a pattern string syntax similar to regex character classes, including string literals and most Unicode properties, yet with a different syntax, especially compared to this new proposal for regex set operators and string literals.
How about some of the other suggestions here?
Or "setOps", "stringSet", "classStringOps", ...?
Crazy idea: Could we change the "unicode" getter to return numeric value 2 instead of boolean false/true, when v is set? Would expressions like if (unicode) then ... still work?
.vunicode
.unicode2
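To make the question about a numeric return value concrete, here is a minimal sketch of how existing checks would react if the unicode getter returned 2. No engine behaves this way; proposedValue and both helper functions are hypothetical stand-ins used only to model the idea.

```javascript
// Hypothetical stand-ins for the getter's return value.
const legacyValue = true; // what /…/u.unicode returns today
const proposedValue = 2;  // the "crazy idea": returned when v is set

// Models code written as `if (re.unicode) { ... }`.
function passesTruthyCheck(value) {
  return value ? true : false;
}

// Models code written as `if (re.unicode === true) { ... }`.
function passesStrictCheck(value) {
  return value === true;
}

const truthyResults = [passesTruthyCheck(legacyValue), passesTruthyCheck(proposedValue)];
const strictResults = [passesStrictCheck(legacyValue), passesStrictCheck(proposedValue)];

console.log(truthyResults); // [ true, true ]
console.log(strictResults); // [ true, false ]
```

Truthiness checks would keep working, but any code that compares the getter strictly against true, or checks typeof for "boolean", would change behavior.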
Pending the resolution of #23, I prefer .stringSets or .stringClasses or something else with the word "string" since that is the main overarching feature that /v is adding.
I’d like to propose resolving this bikeshed with v for the flag letter and unicodeSet for the getter name. Both parts of the getter name make sense, IMHO: unicode → enables the use of Unicode properties of strings, and Set → enables set notation. Let’s discuss during this week’s meeting.
During yesterday’s weekly sync we decided to proceed with v as the flag name, and unicodeSets (note: plural!) for the getter name. Closing this issue to mark its resolution. We’ll update the spec draft later.