`is_valid_name()`: What conforms a valid name?
alejandro-colomar opened this issue · 9 comments
There's is_valid_user_name()
:
alx@debian:~/src/shadow/shadow/wextra$ grepc -ntfd is_valid_user_name src/ lib*
lib/chkname.c:75:
bool is_valid_user_name (const char *name)
{
size_t maxlen;
/*
* User names length are limited by the kernel
*/
maxlen = sysconf(_SC_LOGIN_NAME_MAX);
if (strlen(name) > maxlen)
return false;
return is_valid_name (name);
}
Which limits the length of a valid user name by sysconf(_SC_LOGIN_NAME_MAX)
. But it calls is_valid_name()
:
alx@debian:~/src/shadow/shadow/wextra$ grepc -ntfd is_valid_name src/ lib*
lib/chkname.c:28:
static bool is_valid_name (const char *name)
{
if (allow_bad_names) {
return true;
}
/*
* User/group names must match gnu e-regex:
* [a-zA-Z0-9_.][a-zA-Z0-9_.-]{0,30}[a-zA-Z0-9_.$-]?
*
* as a non-POSIX, extension, allow "$" as the last char for
* sake of Samba 3.x "add machine script"
*
* Also do not allow fully numeric names or just "." or "..".
*/
int numeric;
if ('\0' == *name ||
('.' == *name && (('.' == name[1] && '\0' == name[2]) ||
'\0' == name[1])) ||
!((*name >= 'a' && *name <= 'z') ||
(*name >= 'A' && *name <= 'Z') ||
(*name >= '0' && *name <= '9') ||
*name == '_' ||
*name == '.')) {
return false;
}
numeric = isdigit(*name);
while ('\0' != *++name) {
if (!((*name >= 'a' && *name <= 'z') ||
(*name >= 'A' && *name <= 'Z') ||
(*name >= '0' && *name <= '9') ||
*name == '_' ||
*name == '.' ||
*name == '-' ||
(*name == '$' && name[1] == '\0')
)) {
return false;
}
numeric &= isdigit(*name);
}
return !numeric;
}
Which in a comment, says the name should match a regex that allows no more than 32 characters. I don't see the code actually enforcing that regex, as the loop seems unlimited. So, I have some questions:
- Should we be limiting valid user names to 32 characters?
- Or should we change that regex to specify what the code is really doing?
Or am I missing something?
Cc: @ikerexxe This was triggered by your suggestion of writing tests for this function. :)
See this program:
#include <stdio.h>
#include <unistd.h>
int
main(void)
{
printf("_SC_LOGIN_NAME_MAX: %ld\n", sysconf(_SC_LOGIN_NAME_MAX));
}
alx@debian:~/tmp$ cc -Wall -Wextra login_name_max.c
alx@debian:~/tmp$ ./a.out
_SC_LOGIN_NAME_MAX: 256
No 32.
AFAIK, POSIX doesn't specify a maximum length for the username, so I think it's better to update the comment to reflect what the code does.
Okay. I suggest using this BRE instead: [a-zA-Z0-9_.][a-zA-Z0-9_.-]*\$\?
Okay. I suggest using this BRE instead:
[a-zA-Z0-9_.][a-zA-Z0-9_.-]*\$\?
I think you added an extra \
at the end.
It should be [a-zA-Z0-9_.][a-zA-Z0-9_.-]*\$?
No, in a BRE, \?
is special, but ?
is a simple question mark. It would be in an ERE where it should be ?
.
Although it's true for \$
. I should have used $
, which in the middle of a BRE is just a dollar.
See grep(1):
Basic vs Extended Regular Expressions
In basic regular expressions the meta‐characters ?, +, {, |, (, and )
lose their special meaning; instead use the backslashed versions \?, \+,
\{, \|, \(, and \).
Now I remember that utmp(5) field ut_user
is limited by UT_NAMESIZE
, which is 32
. Does it mean something?
Now I remember that utmp(5) field
ut_user
is limited byUT_NAMESIZE
, which is32
. Does it mean something?
I don't think it means anything - utmp has other aging related issues and is set to be replaced anyway, right?
Right; just wanted to make sure I'm not missing anything.
I've been checking the code and the processing in utmp for the name will be done for the first 32 characters, anything that is beyond that isn't processed. This means that we could have a problem with users where the first 32 characters are the same, and the rest is different. I don't expect that to happen often, as users usually choose usernames shorter than 10 characters. So, yes, in theory we can have a problem, but in reality the probability of hitting this issue is low.