Email validation: constraints on domain labels and Unicode not handled
devikasondhi opened this issue · 1 comment
devikasondhi commented
Hello,
I'm listing some scenarios where is_email fails:
- Domains such as localhost or an IP address literal are not accepted by is_email: email@localhost and email@[127.0.0.1] are valid, but the function returns False.
- Unicode is not handled. This should be valid but returns False:
test@domain.with.idn.tld.\\xe0\\xa4\\x89\\xe0\\xa4\\xa6\\xe0\\xa4\\xbe\\xe0\\xa4\\xb9\\xe0\\xa4\\xb0\\xe0\\xa4\\xa3.\\xe0\\xa4\\xaa\\xe0\\xa4\\xb0\\xe0\\xa5\\x80\\xe0\\xa4\\x95\\xe0\\xa5\\x8d\\xe0\\xa4\\xb7\\xe0\\xa4\\xbe
- Domain labels can't begin or end with a hyphen '-'. These should be invalid, but is_email returns True: example@invalid-.com and example@-invalid.com
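
To make the expected behaviour for the domain part concrete, here is a minimal sketch of a per-label check. It is not the library's implementation: `is_valid_domain` is a hypothetical helper name, and the sketch assumes that Python's built-in `idna` codec is an acceptable way to convert Unicode labels to ASCII before checking them. It accepts single-label hosts such as localhost, but it does not cover IP address literals like [127.0.0.1], which would need separate handling.

```python
import re

# One DNS label: 1-63 letters, digits, or hyphens,
# and it must not start or end with a hyphen.
_LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_domain(domain: str) -> bool:
    """Hypothetical helper: check the domain part of an address label by label."""
    try:
        # The built-in "idna" codec turns Unicode labels into their ASCII (xn--) form.
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        # Raised for empty or overlong labels and for characters the codec rejects.
        return False
    return all(_LABEL_RE.fullmatch(label) for label in ascii_domain.split("."))
```

With this sketch, `is_valid_domain("localhost")` is accepted, while `is_valid_domain("invalid-.com")` and `is_valid_domain("-invalid.com")` are rejected.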
devikasondhi commented
Also, the local part can contain ASCII characters such as !, ' and / (https://en.wikipedia.org/wiki/Email_address). is_email does not handle this well for the input joe!/blow@apache.org.
Further, is_email returns False for the input abc@school.school. It seems there is a limit on the length of the last domain label (nothing longer than 4 characters is accepted).
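
For the local part, a rough sketch of the unquoted ("dot-atom") form from RFC 5322 could look like the following. `is_valid_local_part` is a hypothetical name used only for illustration, quoted local parts are deliberately left out, and the 64-character cap is the local-part limit from RFC 5321.

```python
import re

# RFC 5322 "atext": letters, digits, and these printable special characters.
_ATEXT = r"A-Za-z0-9!#$%&'*+/=?^_`{|}~-"

# Dot-atom: runs of atext separated by single dots, with no leading or trailing dot.
_DOT_ATOM_RE = re.compile(rf"[{_ATEXT}]+(?:\.[{_ATEXT}]+)*")


def is_valid_local_part(local: str) -> bool:
    """Hypothetical helper: accept unquoted local parts only."""
    if not 1 <= len(local) <= 64:
        return False
    return _DOT_ATOM_RE.fullmatch(local) is not None
```

Under these assumptions `is_valid_local_part("joe!/blow")` is accepted, while a local part with a leading dot or two consecutive dots is rejected.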