Regex \h and \v - confusing definitions
From https://perldoc.perl.org/perlre.html:
\v [3] Vertical whitespace
\V [3] Not vertical whitespace
\h [3] Horizontal whitespace
\H [3] Not horizontal whitespace
[3] See Unicode Character Properties in perlunicode https://perldoc.perl.org/perlunicode.html#Unicode-Character-Properties for details
From https://perldoc.perl.org/perlunicode.html:
\p{Blank} This is the same as \h and \p{HorizSpace} : A character that changes the spacing horizontally.
\p{HorizSpace} This is the same as \h and \p{Blank} : a character that changes the spacing horizontally.
\p{VertSpace} This is the same as \v : A character that changes the spacing vertically.
"Changes the spacing" can mean five different things (has nonzero width? tallest character on the line, i.e. defines the line's height? starts a new line? is whitespace? is whitespace but does not start a new line?), and it's not clear which ones are correct.
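For what it's worth, one way to narrow down these readings empirically is to probe individual code points against \h and \v. The following is only a minimal sketch: the helper name classify and the choice of probe code points are my own and do not come from the documentation, and it shows what a given perl actually matches, not what the docs intend to say.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a two-character flag string for one code point.
# First character is 'h' if the character matches \h, second is 'v' if it
# matches \v; '-' means no match.
sub classify {
    my ($cp) = @_;
    my $ch = chr $cp;
    return join '', ($ch =~ /\h/ ? 'h' : '-'), ($ch =~ /\v/ ? 'v' : '-');
}

# Hand-picked probe set (an assumption, not an exhaustive list):
# TAB, LINE FEED, CARRIAGE RETURN, SPACE, LATIN CAPITAL LETTER A,
# LINE SEPARATOR, IDEOGRAPHIC SPACE.
my @probe = (0x09, 0x0A, 0x0D, 0x20, 0x41, 0x2028, 0x3000);

printf "U+%04X %s\n", $_, classify($_) for @probe;
```

If an ordinary letter such as "A" matches neither class, that would rule out the "has nonzero width" and "defines the line's height" readings and leave only the whitespace-based ones.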
perlunicode.html says that some classes are synonyms of Unicode classes, but none of "spacing", "blank", "horizspace" and "vertspace" seem to be defined character classes in http://www.unicode.org/reports/tr44/.
Am I missing something, or are those definitions incomplete?
The content of this documentation is maintained in the Perl source repository, so this should be reported there: https://github.com/Perl/perl5