technoweenie/restful-authentication

Validation with Authentication.email_regex fails on some RFC5322-valid addresses

GGCrew opened this issue · 1 comments

Issue:
The current Authentication.email_regex code rejects certain characters that RFC5322 section 3.2.3 allows in an email address. Specifically, the character class does not include the following symbols: !, #, $, &, ', *, /, =, ?, ^, `, {, |, }, or ~.
http://tools.ietf.org/html/rfc5322#section-3.2.3

Example:
An RFC5322-valid email address of Tom.O'Mally@website.org will fail the current email_regex test.
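
The failure can be reproduced with a short sketch. The patterns below are assumptions: the local-part class is rebuilt from the description in this issue (word characters plus ., %, +, -), and the domain part is a simplified stand-in, not the plugin's own pattern.

```ruby
# Assumed approximation of the current local-part class:
# word characters plus . % + -
current_name = '[\w\.%\+\-]+'.freeze

# Simplified, hypothetical domain pattern used only for this demonstration.
simple_domain = '(?:[A-Z0-9\-]+\.)+[A-Z]{2,}'.freeze

current_email = /\A#{current_name}@#{simple_domain}\z/i

# Returns true when the whole address matches the given pattern.
accepted = ->(regex, address) { regex.match?(address) }

puts accepted.call(current_email, "Tom.OMally@website.org")   # true
puts accepted.call(current_email, "Tom.O'Mally@website.org")  # false: ' is not in the class
```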

Solution:
Expand the Authentication.email_regex code so that it better supports the RFC5322 specification.

Personal Thoughts:
The current Authentication.email_regex code works for the vast majority of email addresses.
In a nutshell, the code looks for valid "words" and a few specific symbols: ., %, +, -
Adding the missing symbols is one option:
email_name_regex = '[\w!#$%&.\'*+\-\/=?^`{|}~]+'.freeze
I think that adding the missing symbols and replacing the current "word" check with specific characters is a better option:
email_name_regex = '[a-zA-Z0-9!#$%&.\'*+\-\/=?^_`{|}~]+'.freeze
The code would be more specific about complying with RFC5322 part 3.2.3, while also setting the stage for full RFC5322 compliance.
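
As a rough check of the explicit-character approach, here is a minimal sketch. The variable names are hypothetical, and the class is my reading of the RFC5322 atext set plus the period. It also shows why the hyphen has to be escaped (or placed last) inside the brackets.

```ruby
# Hypothetical local-part class: every RFC 5322 atext character plus ".".
# Inside single quotes only \' and \\ are string escapes; \- and \/ reach the
# regex engine unchanged, where \- is a literal hyphen rather than a range.
atext_name = '[a-zA-Z0-9!#$%&\'*+\-\/=?^_`{|}~.]+'.freeze
local_part = /\A#{atext_name}\z/

puts local_part.match?("Tom.O'Mally")  # true: the apostrophe is accepted
puts local_part.match?("a,b")          # false: a comma is not atext

# Pitfall: an unescaped "+-/" is a range from "+" to "/", which also admits ",".
range_class = /\A[+-\/]+\z/
puts range_class.match?(",")           # true
```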
(Full compliance requires a few more rules. For example, an email address cannot start with a period (.), and it cannot contain two consecutive periods (..).)
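
Those period rules could be sketched by treating the local part as runs of allowed characters separated by single periods. This is only an illustration under that assumption, with hypothetical names, and it covers the local part alone (no quoted strings, no domain).

```ruby
# Hypothetical dot-atom style local part: one or more runs of atext,
# separated by single periods.
atext    = '[a-zA-Z0-9!#$%&\'*+\-\/=?^_`{|}~]'.freeze
dot_atom = /\A#{atext}+(?:\.#{atext}+)*\z/

puts dot_atom.match?("Tom.O'Mally")  # true
puts dot_atom.match?(".tom")         # false: leading period
puts dot_atom.match?("tom..mally")   # false: consecutive periods
```

A side effect of this shape is that a trailing period is rejected as well.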
The use of escape slashes in my proposed solution was based on this info from Ruby Doc: http://www.ruby-doc.org/docs/ProgrammingRuby/html/language.html#UL
Specifically:

  • All characters except ., |, (, ), [, \, ^, {, +, $, *, and ? match themselves. To match one of these characters, precede it with a backslash.
  • The characters |, (, ), [, ^, $, *, and ?, which have special meanings elsewhere in patterns, lose their special significance between brackets.
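
A few lines of Ruby illustrate the two rules quoted above, along with how single-quoted strings treat backslashes. This is a sketch with made-up sample strings, not code from the plugin.

```ruby
# Outside brackets, + is a quantifier unless it is escaped.
puts 'a+b'.match?(/\Aa\+b\z/)  # true: \+ matches a literal plus
puts 'a+b'.match?(/\Aa+b\z/)   # false: a+ means "one or more a"

# Inside brackets, | ( ) $ * ? + are ordinary characters.
puts '(a|b)*?$+'.match?(/\A[ab|()$*?+]+\z/)  # true

# In a single-quoted Ruby string only \' and \\ are escapes, so \w and \.
# arrive in the pattern with their backslashes intact.
name = '[\w\.]+'
puts name.length  # 7
```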

I am amused that the example email address is not properly parsed by GitHub Flavored Markdown.