whatwg/html

Proposal: HTML passwordrules attribute

dbates-wk opened this issue · 58 comments

HTML passwordrules attribute

Motivation

Some user agents offer to generate random per-site passwords on behalf of the user. Safari has built-in support for this, and add-on password managers such as 1Password add this functionality. This feature improves user security by guaranteeing high-entropy passwords and avoiding reuse of the same password on multiple sites.

One challenge with this approach is that sites have different rules for valid passwords. Many sites require characters from specific sets to be present, or have other constraints. The best known solution is to have a generator rule that matches the password requirements of many sites, plus a curated list of per-site quirks for sites with unusual requirements.

A better solution would be for the website to express its password requirements in machine-readable form, and in a format that is suited for use with a generation algorithm. While the pattern attribute allows expressing many value constraints, it's very hard to use it to drive a generator. It's also tricky to express many popular password constraints (such as a limit on the number of consecutive repeated characters) in a regexp.

Proposed Solution

We propose a new content attribute on the HTML input element called passwordrules and define a mini syntax for web authors to use to express their requirements (rules). Below, we describe how a user agent will make use of these rules and the minimum requirements for the user agent to honor them.

Extensions to HTML

We propose the following new content attribute be added to the HTML input element:

	passwordrules

Using the passwordrules attribute

The passwordrules attribute, when specified, describes the set of extra restrictions on the value of the element's value attribute that a user agent must consider when generating a password and performing client-side form validation. Its value is a semicolon delimited string of one or more property/value pairs and has the form:

required: (<identifier> | <character-class>), ..., (<identifier> | <character-class>); allowed: (<identifier> | <character-class>), ..., (<identifier> | <character-class>); max-consecutive: <non-negative-integer>

An <identifier> must case-insensitively match one of the following strings: upper, lower, digit, special, ascii-printable, and unicode. These identifiers correspond to the set of ASCII uppercase letters (A-Z), lowercase letters (a-z), digits (0-9), all other ASCII printable characters - including the space character - (-~!@#$%^&*_+=`|(){}[:;"'<>,.? ]), all ASCII printable characters, and all Unicode characters, respectively.

A <character-class> is a custom character class.

A <non-negative-integer> is a valid non-negative integer.

The missing value default for passwordrules is allowed: ascii-printable. There is no invalid value default.

The values of multiple required/allowed properties are concatenated together and multiple max-consecutive properties behave as if a single max-consecutive property was specified whose value is the minimum of all max-consecutive properties. Duplicate property values are ignored. Specifying multiple character classes is equivalent to specifying one character class that represents the union of the characters in all character classes. Empty character classes are ignored. Properties without a value are ignored. The following examples illustrate the aforementioned equivalences:

required: upper; required: lower <=> required: upper, lower
allowed: upper; allowed: lower <=> allowed: upper, lower
max-consecutive: 4; max-consecutive: 2 <=> max-consecutive: 2
required: upper, lower, upper <=> required: upper, lower
required: [abc], [def] <=> required: [abcdef]
allowed: upper, [] <=> allowed: upper
required: ; allowed: upper <=> allowed: upper

NOTE: The expression required: upper; required: lower is NOT equivalent to required: upper, lower. See Requiring that a password contain certain characters.
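
As a rough illustration of the syntax above, here is a minimal Python sketch of a parser for the attribute value. It is not part of the proposal. It follows the NOTE above in keeping each required property as a separate group, concatenates allowed properties, and takes the minimum of all max-consecutive values. The function names are hypothetical, and the rule used to find the end of a custom character class (a ']' directly followed by another ']' is treated as a literal) is an assumption, since the proposal does not spell out tokenization.

```python
def split_outside_classes(text, separator):
    """Split text on separator, ignoring separators inside [...] classes."""
    parts, current, in_class = [], [], False
    for index, ch in enumerate(text):
        if ch == separator and not in_class:
            parts.append("".join(current))
            current = []
            continue
        if ch == "[" and not in_class:
            in_class = True
        elif ch == "]" and in_class and text[index + 1:index + 2] != "]":
            # Assumption: a ']' directly followed by another ']' is a literal,
            # so only the last ']' of a run closes the class.
            in_class = False
        current.append(ch)
    parts.append("".join(current))
    return parts


def parse_values(raw):
    """Return the de-duplicated identifiers and classes of one property value."""
    values = []
    for token in split_outside_classes(raw, ","):
        token = token.strip()
        if not token.startswith("["):
            token = token.lower()  # identifiers match case-insensitively
        # Empty tokens, empty classes, and duplicates are ignored.
        if token and token != "[]" and token not in values:
            values.append(token)
    return values


def parse_passwordrules(value):
    """Parse an attribute value into required groups, allowed list, and limit."""
    rules = {"required": [], "allowed": [], "max-consecutive": None}
    for prop in split_outside_classes(value, ";"):
        name, _, raw = prop.partition(":")
        name, raw = name.strip(), raw.strip()
        if name == "required":
            group = parse_values(raw)
            if group:  # properties without a value are ignored
                rules["required"].append(group)
        elif name == "allowed":
            for token in parse_values(raw):
                if token not in rules["allowed"]:
                    rules["allowed"].append(token)
        elif name == "max-consecutive" and raw.isascii() and raw.isdigit():
            limit = int(raw)
            current = rules["max-consecutive"]
            rules["max-consecutive"] = limit if current is None else min(current, limit)
    return rules
```

With this sketch, "required: upper; required: lower" yields two required groups while "required: upper, lower" yields one, which is the distinction the NOTE draws.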

If you do not specify the max-consecutive property then it defaults to being unbounded. That is, the user agent can generate a password with one or more arbitrary length runs of the same character (e.g. ooops).

If you specify the required property and do not specify the allowed property then the user agent will infer the value of the allowed property according to the rules in How a user agent determines the allowed characters.

For example, to require a password have at least 8 characters consisting of a mix of uppercase and lowercase letters, at least one number, and at most two consecutive identical characters, add this to your markup:

<input type="password" minlength="8" passwordrules="required: upper; required: lower; required: digit; max-consecutive: 2">

To require at least one character that is either a digit or one of -().&@?'#,/"+ (either kind satisfies the requirement; both are not needed), add this to your markup:

<input type="password" minlength="8" passwordrules="required: upper; required: lower; required: digit, [-().&@?'#,/&quot;+]; max-consecutive: 2">

Or to require at least one of -().&@?'#,/"+, add this to your markup:

<input type="password" minlength="8" passwordrules="required: upper; required: lower; required: digit; required: [-().&@?'#,/&quot;+]; max-consecutive: 2">

Alternatively, to optionally allow one of -().&@?'#,/"+, add this to your markup:

<input type="password" minlength="8" passwordrules="required: upper; required: lower; required: digit; allowed: [-().&@?'#,/&quot;+]; max-consecutive: 2">

Another example, to allow a password to contain an arbitrary mix of letters, numbers, and -().&@?'#,/"+, add this to your markup:

<input type="password" minlength="8" passwordrules="allowed: upper, lower, digit, [-().&@?'#,/&quot;+]">

WARNING: With the exception of the NOTE below, each property/value pair reduces the entropy of a user agent generated password and makes the password more likely to be guessed or brute-forced. The more characters that are required the more likely the user agent generated password can be guessed or brute-forced.

NOTE: Setting the passwordrules attribute to allowed: unicode provides the most entropy for a user agent generated password. Omitting the passwordrules attribute or setting it to the empty string provides the second most entropy for a user agent generated password.

Custom character classes

A custom character class is a list of ASCII characters that are surrounded by square brackets (e.g. [abc]). Any non-ASCII printable characters in the set are ignored. The dash character (-) is reserved as a special character. To list '-' as a literal character it must appear immediately after the opening square bracket '['. The right square bracket (]) is also reserved as a special character. To list ']' as a literal character it must appear immediately before the closing square bracket ']'.
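
A small Python sketch of how these rules might be applied when reading a custom character class. The function name is hypothetical, and skipping a reserved '-' or ']' that appears in any other position is an assumption; the proposal only says where they must appear to be literal.

```python
def parse_custom_class(text):
    """Return the set of characters listed in a custom class such as "[abc]"."""
    if len(text) < 2 or text[0] != "[" or text[-1] != "]":
        raise ValueError(f"not a custom character class: {text!r}")
    inner = text[1:-1]
    chars = set()
    for index, ch in enumerate(inner):
        if ch == "-" and index != 0:
            continue  # assumption: '-' is literal only right after '['
        if ch == "]" and index != len(inner) - 1:
            continue  # assumption: ']' is literal only right before the closing ']'
        if " " <= ch <= "~":  # characters outside ASCII printable are ignored
            chars.add(ch)
    return chars
```

For instance, parse_custom_class("[*]]") gives the two characters '*' and ']', and parse_custom_class("[[]") gives just '['.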

Specifying the characters allowed to be in a password

The value of the allowed property is a comma-separated list of character class identifiers or custom character classes, or both. Each custom character class represents a set of characters that are allowed to be in the generated password. For example, if the allowed property is set to [*]] then the generated password is allowed to contain ']' and '*', but it is not allowed to contain '[' among other non-listed characters. If the allowed property is set to digit, [@!] then the generated password is allowed to contain one or more ASCII digits, one or more '@'s and one or more '!'s, but it is not allowed to contain '[' among other non-listed characters.

Requiring that a password contain certain characters

You can require that a password contain certain characters or classes of characters by setting the value of the required property to a comma-separated list of character class identifiers or custom character classes, or both. For example, if the required property is set to upper, digit then the user agent MUST generate a password that contains at least one ASCII uppercase letter and at least one digit. If required is set to upper, [@!] then the user agent MUST generate a password that contains at least one ASCII uppercase letter and either '@' or '!'.
A user agent must generate a password that contains at least one character from each required property. For example, if the passwordrules attribute is set to required: upper; required: digit then the user agent MUST generate a password that contains at least one ASCII uppercase letter and at least one digit. If there is a single required property that is set to upper, digit then the user agent MUST generate a password that contains at least one ASCII uppercase letter or at least one digit. If there is a single required property that is set to upper, [@!] then the user agent MUST generate a password that contains at least one ASCII uppercase letter or '@' or '!'.
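
The "at least one character from each required property" rule can be sketched in Python as follows. This is an illustration only; satisfies_required is a hypothetical name, and each required property is assumed to have already been resolved to the set of characters it names.

```python
import string

UPPER = set(string.ascii_uppercase)
DIGIT = set(string.digits)


def satisfies_required(password, required_properties):
    """True if the password has at least one character from every property."""
    return all(
        any(ch in group for ch in password)
        for group in required_properties
    )


# required: upper; required: digit  -> two properties, both must be met
two_properties = [UPPER, DIGIT]
# required: upper, digit            -> one property, either class is enough
one_property = [UPPER | DIGIT]
# required: upper, [@!]             -> one property: uppercase or '@' or '!'
custom_property = [UPPER | set("@!")]
```

Under this reading, "abcD" fails two_properties (no digit) but passes one_property, and "abc@" passes custom_property.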

Limiting the number of consecutive repeated characters

The value of max-consecutive is a non-negative integer that represents the maximum length of a run of consecutive identical characters that can be present in the generated password. For example, set max-consecutive to 2 to disallow a user agent from generating a password that contains a run of more than 2 of the same character (e.g. "ooops" - contains three consecutive o's).
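
A short Python sketch of the run-length check described above. The helper names are hypothetical, and None is used here to stand for an unspecified (unbounded) max-consecutive.

```python
from itertools import groupby


def longest_run(password):
    """Length of the longest run of consecutive identical characters."""
    return max((len(list(run)) for _, run in groupby(password)), default=0)


def respects_max_consecutive(password, max_consecutive=None):
    """True if no run of identical characters exceeds max_consecutive."""
    return max_consecutive is None or longest_run(password) <= max_consecutive
```

With max-consecutive set to 2, "ooops" is rejected because of its run of three o's, while "oops" is accepted.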

How a user agent determines the allowed characters

The set of required characters MUST always be a subset of the set of allowed characters. If the value of passwordrules violates this constraint then the user agent MUST adjust the value of allowed to satisfy it. The following implications immediately fall out from this constraint:

  1. If you specify the required property and do not specify the allowed property then the allowed property is inferred to be the value of the required property.
  2. If you set both the required property and the allowed property then the user agent behaves as if the allowed property were set to the union of the value of the allowed property and the value of the required property. For example, if the required property is set to lower and the allowed property is set to [abc0123] then the user agent MUST behave as if the allowed property were set to lower, [0123]. Another example, if the required property is set to lower and the allowed property is set to upper then the user agent MUST behave as if the allowed property were set to lower, upper.
  3. If neither the required property nor the allowed property is specified then the user agent behaves as if the allowed property were set to ascii-printable.
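
The three implications above can be sketched in Python like this. It is an illustration under the assumption that the required and allowed values have already been resolved to sets of characters; effective_allowed is a hypothetical name, and None stands for an unspecified allowed property.

```python
import string

ASCII_PRINTABLE = {chr(code) for code in range(0x20, 0x7F)}
LOWER = set(string.ascii_lowercase)
UPPER = set(string.ascii_uppercase)


def effective_allowed(required_groups, allowed=None):
    """Return the allowed set a user agent would act on."""
    required_union = set().union(*required_groups)
    if allowed is None:
        # Case 1: only required given. Case 3: nothing given.
        return required_union if required_groups else set(ASCII_PRINTABLE)
    # Case 2: union of the allowed and required characters.
    return set(allowed) | required_union
```

For the example in item 2, effective_allowed([LOWER], set("abc0123")) is the lowercase letters plus '0', '1', '2' and '3', which matches lower, [0123].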

How a user agent generates a password based on passwordrules

A user agent will generate a password using an algorithm or heuristic of its choice that respects the following attributes of a password element (not necessarily in order): minlength, maxlength, and passwordrules. If the set of constraints imposed by the aforementioned attributes fails to meet the following minimum restrictions then the constraints are considered nonconforming and the user agent is REQUIRED to ignore them:

  1. The maximum password length cannot be less than 12.
  2. Allowed characters must consist of at least two of the following character classes: ASCII uppercase letters, ASCII lowercase letters, digits.
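
One possible shape for such a generator, sketched in Python. This is not the algorithm any user agent uses; the proposal leaves that choice open. All names are hypothetical. The sketch assumes the allowed set has already been resolved to a superset of every required group, reads restriction 2 as "the allowed set fully contains at least two of the three classes" (an assumption), and does not sample uniformly from the space of valid passwords. Termination is guaranteed only by the fixed attempt budget.

```python
import secrets
import string
from itertools import groupby

UPPER = set(string.ascii_uppercase)
LOWER = set(string.ascii_lowercase)
DIGIT = set(string.digits)


def is_conforming(allowed, maxlength=None):
    """Apply the two minimum restrictions to already-resolved constraints."""
    if maxlength is not None and maxlength < 12:
        return False
    # Assumption: a class counts only if the allowed set contains all of it.
    return sum(1 for cls in (UPPER, LOWER, DIGIT) if cls <= allowed) >= 2


def longest_run(text):
    """Length of the longest run of consecutive identical characters."""
    return max((len(list(run)) for _, run in groupby(text)), default=0)


def generate_password(required_groups, allowed, length,
                      max_consecutive=None, attempts=1000):
    """Build candidates that meet the required groups, then check the runs."""
    if length < len(required_groups):
        raise ValueError("length is too short for the required properties")
    rng = secrets.SystemRandom()
    pool = sorted(allowed)
    for _ in range(attempts):
        # One character per required property, the rest from the allowed set.
        chars = [rng.choice(sorted(group)) for group in required_groups]
        chars += [rng.choice(pool) for _ in range(length - len(chars))]
        rng.shuffle(chars)
        candidate = "".join(chars)
        if max_consecutive is None or longest_run(candidate) <= max_consecutive:
            return candidate
    raise RuntimeError("no conforming password found within the attempt budget")
```

Placing one character from each required group first avoids rejecting candidates for missing a required class, so only the max-consecutive check can cause a retry.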

Characters in the generated password MUST be expressed in Normalization Form C and must conform to the following UAX31 profile:

Interaction with client-side form validation

It is not recommended to specify both the pattern attribute and the passwordrules attribute.

The passwordrules attribute participates in constraint validation. If the element's value attribute does not satisfy the criteria specified by the value of the passwordrules attribute then the element is in the "suffering from a passwordrules mismatch" validity state and the element is invalid for the purposes of constraint validation.

Confirmation password field

Some web pages have both a password field ("primary password field") and a confirmation password field. The passwordrules attribute needs only to appear on one of these fields. If both fields have the passwordrules attribute then you must ensure that they have the same value. Otherwise, the user agent will behave as if both fields have set their passwordrules attribute to the result of the union of both fields' required properties (if any) and the intersection of both fields' allowed properties (if any) after simplifying the passwordrules attribute of both fields according to rules in Using the passwordrules attribute. For example, if a page contains the following markup:

<input type="password" name="password" minlength="8" passwordrules="required: upper, lower, digit, [-().&@?'#,/&quot;+]; max-consecutive: 2">
<input type="password" name="confirmation-password" minlength="8" passwordrules="required: upper; allowed: [!]; max-consecutive: 3">

Then the user agent must behave as if the markup was:

<input type="password" name="password" minlength="8" passwordrules="required: upper, lower, digit, [-().&@?'#,/&quot;+]; max-consecutive: 2">
<input type="password" name="confirmation-password" minlength="8" passwordrules="required: upper, lower, digit, [-().&@?'#,/&quot;+]; max-consecutive: 2">

possible refinement, since a perverse required could apparently shrink the space a lot: "the password constraints will be ignored if they would reduce the number of possible passwords below 2**60" or something like that -- otherwise I think there are some very low-entropy edge-cases that come about due to too many required elements effectively turning the generated password into a mere permutation of those elements

@bsittler Good catch! I agree that it's bad if perverse password rules limit the number of possibilities to an overly low number.

For an implementation requirement like "the password constraints will be ignored if they would reduce the number of possible passwords below 2**60", the standard would need to include an algorithm to calculate the number of possible passwords to enable interoperable behavior. If browsers calculated it slightly differently, it would be a significant interop problem.
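
To give a sense of what such a calculation could look like, here is a hedged Python sketch that counts the passwords of one fixed length by inclusion-exclusion over the required properties. It is only an illustration, not a proposed normative algorithm: it ignores max-consecutive, handles a single length rather than a minlength to maxlength range, and count_passwords is a hypothetical name. Its cost grows as 2 to the power of the number of required properties.

```python
import math
import string
from itertools import combinations


def count_passwords(allowed, required_groups, length):
    """Count strings of the given length over allowed that contain at least
    one character from every required group (max-consecutive is ignored)."""
    total = 0
    for k in range(len(required_groups) + 1):
        for subset in combinations(required_groups, k):
            # Strings that avoid every group in this subset entirely.
            excluded = set().union(*subset)
            total += (-1) ** k * len(allowed - excluded) ** length
    return total


UPPER = set(string.ascii_uppercase)
LOWER = set(string.ascii_lowercase)
DIGIT = set(string.digits)

# Typical rule set: one upper, one lower, one digit, 12 characters.
typical = count_passwords(UPPER | LOWER | DIGIT, [UPPER, LOWER, DIGIT], 12)

# Perverse rule set: 12 single-character required properties, 12 characters.
letters = set(string.ascii_lowercase[:12])
perverse = count_passwords(letters, [{ch} for ch in letters], 12)
```

Here math.log2(typical) is comfortably above 60, while perverse equals math.factorial(12): the password is reduced to a permutation of the required characters, which is the edge case raised above.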

We tried to evade the need for a full entropy calculation by having higher-level rules to ensure a wide enough range of passwords. Specifically, passwordrules must be ignored if the max length is too low, or the set of allowed characters is too small a range. You are right that excessive "required" directives could also overly limit the passwords. In the spirit of the easier-to-determine rule for rejecting overly restrictive "passwordrules", how about setting an upper limit on the number of "required" directives that may be present?

First of all, I really like this! Giving declarative credential generation more love is great.

My main worry here is the complexity of the attribute and requiring another custom parser for it. Can we consolidate that with something somehow? Perhaps just having more attributes or going full JSON?

Should we also integrate this with https://w3c.github.io/webappsec-credential-management/ somehow? I understand that has adoption due to WebAuthn so presumably it's something that'll stick around and we need to account for?

(The other thing we should include in the examples advocating this technique is autocomplete=current-password and autocomplete=new-password. This is only needed for the latter (and only for the first of its kind on a page, per OP).)

@othermaciej indeed, and I actually considered including such a wrinkle on my original comment but realized that some required values aren't particularly bad this way while others are, and evaluating them this way is a little problematic (approaching the complexity of overall entropy computation.) A very rough approximation might be: maxlength may be no less than 12 + the number of "trivial" required elements. To be considered trivial, a required element must permit no more than 35 possibilities in the printable ASCII range. correction: cutoff was supposed to be 31 - this means allowing punctuation as a non-trivial required element, which should satisfy lots of existing rules without undue penalty

Another question: is character class merging the intended behavior for required? It seems like it shouldn't be, but this suggests otherwise:

Specifying multiple character classes is equivalent to specifying one character class that represents the union of the characters in all character classes

Otherwise required only ever has at most one character of influence

@bsittler I think what this proposal says is right for allowed but probably not for required. required: upper, lower should require at least one ASCII alphabetic character, while required: upper; required: lower should require at least one uppercase character and at least one lowercase. I am not sure what @dbates-wk 's intent was when writing this but I think that's how it should work. For allowed, multiple directives and a single directive with commas would be equivalent under any reasonable interpretation.

On the "trivial character class" rule, that makes sense to me as an approach, but the specific proposal would require a minimum length of 15 instead of 12 for passwords with the typical "must include at least one uppercase, at least one lowercase, at least one number" restriction. If in addition a special character is required, that would be a minimum length of 16. That seems excessive, as adequate entropy is possible for 12-character passwords with either of these common restrictions.
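
The arithmetic behind those numbers can be checked with a small Python sketch. It is an illustration of the rough rule as described in this thread, not part of the proposal; minimum_maxlength is a hypothetical name, and the cutoff is a parameter because both 35 and the corrected 31 were mentioned.

```python
import string

PRINTABLE = {chr(code) for code in range(0x20, 0x7F)}
CLASSES = {
    "upper": set(string.ascii_uppercase),
    "lower": set(string.ascii_lowercase),
    "digit": set(string.digits),
}
# "special" is every other ASCII printable character, including the space.
CLASSES["special"] = (
    PRINTABLE - CLASSES["upper"] - CLASSES["lower"] - CLASSES["digit"]
)


def minimum_maxlength(required, cutoff, base=12):
    """base plus one for each required class with at most cutoff characters."""
    trivial = [name for name in required if len(CLASSES[name]) <= cutoff]
    return base + len(trivial)
```

upper, lower and digit have 26, 26 and 10 characters, so all three are trivial under either cutoff and the minimum becomes 15. special has 33 characters, so it is trivial under a cutoff of 35 (minimum 16) but not under the corrected cutoff of 31 (minimum stays 15).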

@annevk We care more about the capabilities than the syntax. That said:

  • Multiple attributes is possible, but it would result in three attributes of which two have (similar) nontrivial syntax, so it would not avoid the need for an extra mini-parser.
  • JSON seems like overkill.
  • Credential Management is programmatic, while this is declarative (and that's part of the use case). So not clear how they could be integrated. I don't think the parts of Credential Management that aren't required for WebAuthN are likely to get wide traction.

I disagree with the fundamental premise of this. :( Restrictions on passwords beyond minimum length (and maybe a large maximum length) are all fundamentally bad, particularly restricted characters - such restrictions indicate that the site is storing passwords very badly (in plaintext, with bad escaping practices when interacting with their database). Required characters are also generally a bad restriction - it's much better to simply increase the minimum length and let people use whatever characters they (or their pw generators) want.

Do we really want to be adding a feature whose primary use-case is making it easier for already-broken sites to continue being broken?

Restrictions on passwords are indeed bad. I agree it would be best if they went away. But it also seems unlikely they will go away any time soon.

Password generators are extremely good. About the safest thing anyone can do for their online security is to use a unique randomly generated password for each site.

If password generators can't work with the existing password restrictions of websites, then that leads to a bad user experience (user counts on generator, then the site rejects their password) and poor security (user makes up a weak or reused password on the spot). The current state of the art is to maintain a list of site-specific quirks to get the password generator to do its job right. Safari has a pretty extensive set. We'd like password generators (including ours) to be able to do a good job without needing a quirks list.

Thus, even though password restrictions are likely harmful on net (other than minimum length), the most practical harm reduction is for sites with restrictions to make it obvious and machine readable what those restrictions are.

@annevk:

Restrictions on passwords beyond minimum length (and maybe a large maximum length) are all fundamentally bad, particularly restricted characters - such restrictions indicate that the site is storing passwords very badly (in plaintext, with bad escaping practices when interacting with their database). Required characters are also generally a bad restriction - it's much better to simply increase the minimum length and let people use whatever characters they (or their pw generators) want.

If the WHATWG decides to add a passwordrules attribute, the attribute’s specification could include an informative note stressing that password restrictions are Useless and Bad and that storing passwords as plain text is Very Bad. This news still has not percolated through to many IT organizations; any opportunity to forcefully communicate this to them is valuable. As long as password restrictions remain a common practice on the web, for better and for worse, the new attribute could be a good opportunity for the WHATWG to emphatically recommend that web developers not use password restrictions at all.

It seems like many, but not all, use cases in the OP can be covered by the existing pattern attribute. (For example, specifying allowed or disallowed characters.) Could we consider scoping this down to only the use cases that cannot be accomplished with today's technology?

Another question: is character class merging the intended behavior for required? It seems like it shouldn't be, but this suggests otherwise:

Specifying multiple character classes is equivalent to specifying one character class that represents the union of the characters in all character classes
Otherwise required only ever has at most one character of influence

You're right! I updated my proposal to remove this sentence (indicated by a strikethrough).

If the WHATWG decides to add a passwordrules attribute, the attribute’s specification could include an informative note stressing that password restrictions are Useless

I take it you feel that the WARNING paragraph in the proposal is not sufficient?

It seems like many, but not all, use cases in the OP can be covered by the existing pattern attribute. (For example, specifying allowed or disallowed characters.) Could we consider scoping this down to only the use cases that cannot be accomplished with today's technology?

Although some of the use cases could be accomplished with today's technology, they cannot be accomplished easily or succinctly. For instance, consider the following common variant of the first example in the proposal that disregards the consecutive character requirement: a password that has at least 8 characters consisting of a mix of uppercase and lowercase letters and at least one number. This can be accomplished with today's technology, but it is non-trivial to do so. The regexps in https://stackoverflow.com/questions/19605150/regex-for-password-must-contain-at-least-eight-characters-at-least-one-number-a show what accomplishing this task, or variants of it, involves.

@js-choi I don't think password restrictions are related to storing passwords in plaintext. They are either because of dumb legacy system limitations (max lengths, very restricted set of allowed characters), actually good (minimum length limit) or well-intentioned attempts to get users to make handmade passwords that are resilient to guessing or offline dictionary attack against a leaked hashed password database (for example, the popular "one letter, one number, one special" requirement).

@othermaciej: I agree insofar as many cases of password restrictions are due to dumb legacy system limitations or well-intentioned encouragement of better handmade passwords. I was mostly responding to @tabatkins’s saying that "such restrictions indicate that the site is storing passwords very badly (in plaintext, with bad escaping practices when interacting with their database)", which may well also be sometimes true.

@dbates-wk: The currently worded warning:

WARNING: With the exception of the NOTE below, each property/value pair reduces the entropy of a user agent generated password and makes the password more likely to be guessed or brute-forced. The more characters that are required the more likely the user agent generated password can be guessed or brute-forced.

…is not quite forceful or emphatic in discouraging password restrictions in general, a discouragement that @tabatkins probably believes ought to be done. I personally am sympathetic to his view, but I am also sympathetic to making usability better for users of password managers. From my own field, bad password restrictions are a particularly pernicious problem in healthcare/clinical applications.

Addressing password restrictions at all may be seen by developers as a general statement from WHATWG on its disposition toward password restrictions, for better or for worse. Care should therefore be taken in how its specification is worded: it probably would not hurt for that warning above to be more forceful and emphatic against password restrictions in general. Such force may somewhat ameliorate @tabatkins’s general reservations against addressing password restrictions at all.

@domenic We thought about just using the pattern attribute, but there are two challenges:

(1) Consider a common limitation like: "must contain at least one letter and at least one number, and may contain !@#$%^&*()_+-=". It's possible to do with a regexp but it's pretty non-obvious.

Here is the clearest regexp I could come up with that implements this rule: (([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*[A-Za-z]([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*[0-9]([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*)|(([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*[0-9]([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*[A-Za-z]([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*). That is a lot harder to write correctly and a lot harder to understand than "required: upper, lower; required: digit; allowed: [-!@#$%^&*()_+=]". Using regexps to represent this rule is likely too hard for web developers to do correctly.

(2) In theory it's possible to use a regexp to drive generation rather than matching, but it's pretty hard. Getting a password generator to produce a uniformly random password that matches an arbitrary regexp is possible in theory, but way harder than getting it to produce a random password that matches rules of the type that passwordrules supports. Also, password generators may try to be clever and make passwords that are easy to type (for cases where you have to log into another device without benefit of autofill), at least when rules are flexible enough to allow it. It's straightforward to do this with the limited kinds of rules that passwordrules supports but infeasible to do with a generator that can be driven by a regexp. You could argue that maybe only a subset of regexps should be supported, but how do you decide what subset? It can take a very complex regexp just to represent a simple rule. It's also not very good for web devs if they are supposed to use pattern but must be very careful what they put in it or it will be ignored.

So even though passwordrules is technically redundant with pattern, it's still a practical addition because it makes the password requirements easier to write, easier to understand, easier to verify, and easier to feed to a generator. This is what made us conclude that we need a new feature and can't just reuse pattern.

@js-choi I am fine with having a more assertive warning. I think the wording in the spec will have very little influence on prevalence of password restrictions one way or the other, but we should do our best to avoid proliferating restrictions even a little.

Zirro commented

Are there many sites out there which restrict passwords in this way, yet still receive enough attention from developers who would be likely to add this attribute? It's anecdotal, but the only sites on which I've encountered restrictive password limitations are ones which have not seen updates for years.

I'm also a bit concerned that adding an attribute - despite warnings in the spec - might encourage more sites to introduce restrictions. Do you have data showing how common password restrictions are today, and if their usage is declining? It seems like this might become a smaller problem within a few years, as older systems get replaced.

Zirro commented

Getting a password generator to produce a uniformly random password that matches an arbitrary regexp is possible in theory...

It seems like it should be possible to cover 99% of cases by generating ~50 passwords according to different rules and transparently match them against the pattern attribute until you find the most preferable one which is allowed. How long is the list of sites with unusual rules, and how quirky are they?

Unless we foresee other uses for it besides covering the remaining cases for password rules, the added complexity of introducing a unique syntax ought to be avoided.

@Zirro Many sites have password restrictions, including ones that are popular and actively maintained.

For example, here's the restrictions from etrade.com (as stated by the site):

  • Needs 8-32 characters with no spaces
  • Needs at least one number
  • Needs at least one uppercase and one lowercase letter
  • Cannot be the same as your user ID

Other sites have hidden restrictions. They don't name any up front, but reject some passwords in practice.

It seems like it should be possible to cover 99% of cases by generating ~50 passwords according to different rules and transparently match them against the pattern attribute until you find the most preferable one which is allowed.

This is inefficient and likely to still fail in edge cases, so I doubt we'd adopt this over a quirks list. Also, the bigger problem with pattern is that it's very hard to write regexps that correctly implement many popular password limitations. Site authors could use pattern today but they don't.

While I am sympathetic to the desire to avoid technically redundant features, I think framing password rules in a more direct way will solve a real practical problem that can't be solved just by pushing existing features harder.

@othermaciej:

@bsittler I think what this proposal says is right for allowed but probably not for required. required: upper, lower should require at least one ASCII alphabetic character, while required: upper; required: lower should require at least one uppercase character and at least one lowercase.

Fixed this up to match your expectation.

I updated the proposal. With the exception of the example sections, I demarcated removals from and additions to the original proposal using strikethrough and italic, respectively.

@dbates-wk the updates are improvements from my point of view. A few issues still concern me:

  • So far as I can tell there is no limit on how many narrow-character-class required: limitations a site can impose, which means:
    • unacceptable entropy-reduction is possible; I believe target minimum entropy levels for generators should be clearly stated in the proposal, even if no formula is given to compute the level based on the passwordrules
    • in these cases selecting conforming password candidates from a less-restricted candidate list may take too long to terminate - e.g. a list generated naively based on random selection (for each character) from the printable-ASCII subset of allowed:; to overcome this I think the proposal needs to include at least a rough outline or pseudocode for a conforming generator with guaranteed termination
  • The character class syntax differs from JS regular expressions; is this intentional? If so, it should be noted more prominently; if not, the gaps should be closed
  • As an example: how would a requirement for one of [, - or ] be expressed? I believe in JS regular expressions it would be [-[\]]

prlbr commented

@othermaciej

Many sites have password restrictions, including ones that are popular and actively maintained.

The actively maintained sites could be educated to lessen their technical restrictions. They could still give their users recommendations for choosing a good password without actually reducing the space of possible passwords.

@bsittler

So far as I can tell there is no limit on how many narrow-character-class required: limitations a site can impose, which means:
unacceptable entropy-reduction is possible; I believe target minimum entropy levels for generators should be clearly stated in the proposal, even if no formula is given to compute the level based on the passwordrules

Do you have a particular minimum entropy level in mind? Otherwise, I will think about it and get back to you.

in these cases selecting conforming password candidates from a less-restricted candidate list may take too long to terminate - e.g. a list generated naively based on random selection (for each character) from the printable-ASCII subset of allowed:; to overcome this I think the proposal needs to include at least a rough outline or pseudocode for a conforming generator with guaranteed termination
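One rough way to get guaranteed termination, sketched here as an illustration and not as part of the proposal: draw one character from each required class first, fill the remaining positions from the allowed set, then shuffle. The names `generate_password`, `required_classes` and `allowed` are hypothetical. The sketch assumes that required characters are implicitly allowed, it ignores any limit on consecutive repeated characters, and its output is not guaranteed to be uniformly distributed over all conforming passwords.

```python
import secrets
import string


def generate_password(required_classes, allowed, length):
    """Build a password in a fixed number of steps (no rejection loop)."""
    if length < len(required_classes):
        raise ValueError("length is too short to satisfy every required class")
    if any(not cls for cls in required_classes):
        raise ValueError("a required class is empty")
    # Assumption: characters in required classes are implicitly allowed.
    pool = sorted(set(allowed).union(*required_classes))
    if not pool:
        raise ValueError("no characters to choose from")
    # One character per required class guarantees every requirement is met.
    chars = [secrets.choice(cls) for cls in required_classes]
    # Fill the remaining positions from the whole pool.
    chars += [secrets.choice(pool) for _ in range(length - len(chars))]
    # Fisher-Yates shuffle so the required characters are not always first.
    for i in range(len(chars) - 1, 0, -1):
        j = secrets.randbelow(i + 1)
        chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)


password = generate_password(
    [string.ascii_lowercase, string.ascii_uppercase, string.digits], "-_", 16
)
```

The work done is proportional to the requested length, however narrow the required classes are, which is the property the naive "generate, then reject" approach lacks.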

The character class syntax differs from JS regular expressions; is this intentional?

Yes, this is intentional because we do not need to represent arbitrary character ranges given that a custom character class syntax is designed to only contain ASCII printable characters and we expose literals to represent all the common ASCII printable character ranges (e.g. "lowercase" is equivalent to regex [a-z]+). The current proposal reserves '-' should we need to support arbitrary character ranges. See section "Custom character classes" or my reply to your last question for details on how to express '-' using the proposed syntax.

If so, it should be noted more prominently;

OK. I can add a remark about this.

[...] As an example: how would a requirement for one of [, - or ] be expressed? I believe in JS regular expressions it would be [-[]]

No escaping is necessary to express '[': [[]. The third from the last sentence and last sentence of section "Custom character classes" explain how to express '-' and ']', respectively. Quoting the proposal:

To list '-' as a literal character it must appear immediately after the opening square bracket '['. The right square bracket (]) is also reserved as a special character. To list ']' as a literal character it must appear immediately before the closing square bracket ']'.
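The two placement rules quoted above can be checked mechanically. The following is a minimal sketch, assuming a custom class is always written as one bracketed string; `parse_custom_class` is a hypothetical helper name, and the sketch does not check that the characters are ASCII printable.

```python
def parse_custom_class(text):
    """Return the set of literal characters in a class such as "[-[]]"."""
    if len(text) < 2 or text[0] != "[" or text[-1] != "]":
        raise ValueError("a custom class must be wrapped in square brackets")
    inner = text[1:-1]
    for index, char in enumerate(inner):
        # '-' is reserved, so it is only a literal in the first position.
        if char == "-" and index != 0:
            raise ValueError("'-' must come right after the opening '['")
        # ']' is reserved, so it is only a literal in the last position.
        if char == "]" and index != len(inner) - 1:
            raise ValueError("']' must come right before the closing ']'")
    return set(inner)


chars = parse_custom_class("[-[]]")  # the set containing '-', '[' and ']'
```

Under these rules `[-[]]` lists all three characters without any escaping, while a regexp-style range such as `[a-z]` is rejected because '-' is not in the first position.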

I think 90 bits is a reasonable minimum, but will happily defer to real cryptographers.

And thank you for addressing the rest of those questions! It might be worth including the [-[]] example as its syntax may be a bit surprising for someone familiar with regular expressions

90 bits of entropy is excessive. For the "repeatedly guess" threat model, a much lower number of bits will stop the attacker (so long as the website has reasonable rate limits and/or an attempt limit). Even 20 bits is reasonably effective for this case (though obviously not ideal). 20 bits is equivalent to a 6-digit numeric passcode.

For the "offline attack against leaked database" threat model, the number of bits needed depends on the quality of password hashing used by the website. I did the math on this a while ago based on fastest known password cracking and then assuming a few power of two speedups on top of that:

Strong (bcrypt, PBKDF, scrypt): ~47 bits needed
Decent (SHA512): ~49 bits needed
Poor (SHA1): ~66 bits needed
Terrible (NTLM, DES CRYPT, MD5): just give up

So it's probably not right to have a hard limit significantly higher than 47 bits. Note that if the site allows entropy somewhat below the limit, it's probably still better to know their password rules and make a generated password, instead of ignoring them and forcing the user to make a manual one.

As an extra safety margin, Safari tries to generate passwords with >70 bits of entropy, but we would still want to generate something on sites that won't allow our full format.

Also, entropy calculations are nontrivial, especially in the presence of multiple required character classes. We can't just make it a vague requirement without including the calculation algorithm. Based on this I don't think we should have a direct entropy limit at all. Instead, we should have limitations that are more readily checkable.

90 bits is assuming "just give up"-quality hashing (unfortunately still widely used), an offline attack, a well-funded attacker, and cheap hardware on a large scale (e.g. botnet or dedicated cryptomining-style hardware farms)

90 bits of entropy is not enough to defend against a parallelized offline attack against a leaked database that uses garbage-tier hashing. I don't think there is any practical number of bits that works for those cases. They can be computed very quickly on commodity hardware and most have practical preimage attacks (not quite yet for MD5 but it's close). In my belief it is not possible to defend against an offline database attack for a site that uses very weak hashing no matter how strong your password is. In practice all you can do is make sure to not reuse passwords between sites, which password generators facilitate.

Requiring 90 bits is also more than most password generators will use, and more than many sites with mildly silly restrictions can support. So setting that as the floor would make this feature useless, and will result in failed password generation (and therefore human-generated passwords) on most sites.

Note that this feature is a harm reduction feature (reducing the collateral damage of dumb password restrictions by still letting password generators do the best they can), not a best practices feature. It's best if password generators can work on as many sites as possible, even if some sites are individually not defensible due to bad hashes or excessive password limitations.

(it also builds in a significant safety margin to account for so-far-unknown structural flaws, as seen e.g. in triple DES encryption, and to make online attacks with malicious code executing on the same CPU sharing cache/speculative execution byproducts more expensive)

Depends on which garbage-tier hash :)

Overall, though, I actually agree. Any number chosen is a compromise and 70 bits seems like a very reasonable one to me

Apple is releasing passwordrules attribute in Safari 12 as feature "Automatic Strong Passwords", see
https://webkit.org/blog/8327/safari-technology-preview-58-with-safari-12-features-is-now-available/
https://developer.apple.com/password-rules/

So there's now a PR for this issue. It's unclear to me we have more implementers interested than Safari at this point though. It's also not really clear to me if we reached agreement that if we added this, whether we should recommend against using it except for sites that have legacy backends and such. I'd appreciate help.

cc @mnoorenberghe

I do want to say that even if the feature doesn't accumulate enough multi-implementer interest to land, I'm really happy to see the PR. Full-fledged spec PRs (and tests) are a great way of concretizing a proposal and making it easier for that multi-implementer interest to appear later, even if they end up hanging for a while. (We have many such awaiting-interest PRs.)

It looks like Chrome has a built-in password generator now too. Does anyone know of a good contact person for this feature? (I figure browsers that feature password generation are more likely to be interested in this feature).

I can't find any indication of Firefox or IE having built-in password generation.

Would interest from implementors of add-ons or extensions that feature password generation be relevant for this feature?

It'd be good to know for sure, but per https://whatwg.org/working-mode#changes and how we generally talk about implementers, we'd need two browsers on board as well.

I'd be open to revisiting that sort of thing though, in some way. We had similar discussions about #3870; see #2945 (comment) and following comments. Maybe a good discussion for whatwg/meta (or whatwg/sg?).

I'll try to ask around inside Chrome to find the appropriate folks.

Dominic from Chrome's password manager team here.

I took a look at the proposed specification and it seems to be generally sound. A few thoughts:

  1. I think it is a good idea to use a very simple language rather than one that tries to cover every possible corner case of password requirements imaginable (i.e. no 'it cannot look like a birthday').
  2. I have some concerns that if these passwordrules are only intended for password generation but not for validation of user-input, it will be abused to deceive password managers. But note: If it became used for input validation, a default of "allowed: ascii-printable" would be problematic of course because it would exclude non-ascii alphabets.
  3. Unfortunately, I don't expect a lot of impact by this spec.

Looking at password fields from sign-in forms and sign-up forms in the wild (a sample of a mix of very popular sites and the long tail), I see the following statistics for the current use of autocomplete attributes (not weighted by visits, every site counts equally):

Sign-in forms:

  • empty: 66.9%
  • off: 28.8%
  • current-password: 1.5%
  • on: 1.1%
  • new-password: 1.1%
  • other strings ("nope", "nothing", "foo", ...): 0.8%

Sign-up forms:

  • empty: 96.1%
  • off: 3.1%
  • new-password: 0.5%
  • current-password: 0.0% (but >0)
  • false: 0.0% (but >0)
  • on: 0.0% (but >0)

This means that the autocomplete attribute is used correctly for sign-in forms in 1.5% of cases and in a deceiving way in 1.1% (+30% to say 'off' in various ways) and correctly on sign-up forms in 0.5% of cases, to disable filling in ~3% and not used at all in the vast majority of cases.

With this I am currently not really convinced that this will have enough positive impact to implement it. In particular I expect that those sites that impose any password requirements won't use it.

Really great discussion above.

I've made a related proposal for a <input type=password strong>, which takes a different approach by validating password entropy algorithmically.

I work on 1Password's password generator.

I would like to offer a few comments on some of the discussion.

ascii-printable is the right default

@battre said

a default of "allowed: ascii-printable" would be problematic of course because it would exclude non-ascii alphabets.

I strongly advise against encouraging unicode in passwords until we can be confident that the site properly normalizes the unicode. There are (at least) three different byte sequences that a character like "Å" can come out as. If the system accepting the password doesn't systematically normalize the input, then a user can find themselves locked out merely by changing their keyboard or even an operating system upgrade.
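To make the normalization point concrete, here is a small illustration using Python's standard `unicodedata` module. It encodes three ways of writing Å (A with ring above) and shows that they only collapse to one byte sequence after normalization. The choice of NFC is an assumption made for the example, not something the proposal specifies.

```python
import unicodedata

forms = [
    "\u00c5",   # LATIN CAPITAL LETTER A WITH RING ABOVE
    "\u212b",   # ANGSTROM SIGN
    "A\u030a",  # "A" followed by COMBINING RING ABOVE
]

# Distinct UTF-8 byte sequences before normalization.
raw = {form.encode("utf-8") for form in forms}

# Byte sequences after normalizing every form to NFC.
normalized = {unicodedata.normalize("NFC", form).encode("utf-8") for form in forms}

print(len(raw))         # 3
print(len(normalized))  # 1
```

A server that hashes the raw bytes would treat these as three different passwords.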

Entropy calculation in the face of multiple required sets

Ensuring a uniform distribution and being able to compute the strength is a tricky problem. My colleagues and I presented a solution; please see https://github.com/1Password/spg/blob/master/passwordscon/paper.pdf

We get some bits for free

When targeting particular strength, we should keep in mind that it is much more computationally expensive to test 2^70 passwords than it is to test 2^70 AES keys. Exactly how much more expensive is going to depend on how they are hashed, but even without any slow hashing, the process of generating guesses of passwords to try is going to be far slower than generating a key.

So where we might want a 90-bit key, we can make do with a 70-bit password (by some estimates).

With this, I am currently not really convinced that this will have enough positive impact to implement it. In particular I expect that those sites that impose any password requirements won't use it.

I also work for 1Password and completely agree with Dominic. Although I've also dreamed of having a spec like this, folks who would most likely implement it would be least likely to require any password rules (outside of perhaps minlength, which can already be added).

One small data point: hey.com has adopted passwordrules. (In fairness, these rules would probably be met automatically by any reasonable password generator.)

<input autocorrect="off" autocapitalize="off" required="required" passwordrules="required: lower; required: upper; required: digit; minlength: 12;" class="input input--full-width input--underlined align--center txt--x-large txt--bold txt--x-large@mobile" autofocus="autofocus" autocomplete="new-password" size="24" data-steps-target="input" aria-label="Password" type="password" name="sign_up[password]" id="sign_up_password">
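For readers who want to see how a password manager might read such an attribute value, here is a minimal parsing sketch. It assumes only the properties that appear in this thread (`required`, `allowed`, `minlength`, `maxlength`) and only named character classes. `parse_password_rules` is a hypothetical name, and splitting on `;`, `:` and `,` would break on custom classes that contain those characters.

```python
def parse_password_rules(value):
    """Parse e.g. "required: lower; minlength: 12;" into a dict."""
    rules = {"required": [], "allowed": [], "minlength": None, "maxlength": None}
    for part in value.split(";"):
        if not part.strip():
            continue  # tolerate a trailing semicolon
        name, _, raw = part.partition(":")
        name, raw = name.strip(), raw.strip()
        if name in ("minlength", "maxlength"):
            rules[name] = int(raw)
        elif name in ("required", "allowed"):
            classes = [c.strip() for c in raw.split(",") if c.strip()]
            if name == "required":
                # Each "required" rule is its own group: at least one
                # character from the union of the classes it lists.
                rules["required"].append(classes)
            else:
                rules["allowed"].extend(classes)
        # Unknown properties are ignored in this sketch.
    return rules


hey = parse_password_rules(
    "required: lower; required: upper; required: digit; minlength: 12;"
)
```

For the hey.com value this yields three separate required groups and a minimum length of 12, whereas `required: upper, lower` would yield a single group containing both classes, matching the distinction discussed earlier in the thread.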

Whether 70 bits of entropy is enough for a password is kind of dependent on what hash function is in use (for the password database breach threat model at least). If it's bcrypt or scrypt, then 70 bits is more than enough. If it's MD5, then probably not. For very weak hashes like legacy Unix crypt there is no feasible password length that would be safe.

I didn't know about this proposal prior, but I wrote a related sketch here: https://discourse.wicg.io/t/add-password-restriction-attributes-to-input-type-password/4767

It's mostly the same as this mod syntax, but it's more regexp-based and deconstructs it a bit more. Also, I'm not a fan of using literals only, as while alphanumeric characters are a pretty standard subset, support for special characters varies so wildly between sites in my experience I'm not convinced any named subset for that is worth anything. That and some places explicitly include various Unicode characters as valid. Some places are also "must contain characters in at least 3 of the following groups" and others have restricted words (like emails), and the current proposal as specified above doesn't encompass either.

I didn't add anything for consecutive character restrictions, though.

dcow commented

Dominic from Chrome's password manager team here.

...

Looking at password fields from sign-in forms and sign-up forms in the wild (a sample of a mix of very popular sites and the long tail), I see the following statistics for the current use of autocomplete attributes (not weighted by visits, every site counts equally):

...

This means that the autocomplete attribute is used correctly for sign-in forms in 1.5% of cases and in a deceiving way in 1.1% (+30% to say 'off' in various ways) and correctly on sign-up forms in 0.5% of cases, to disable filling in ~3% and not used at all in the vast majority of cases.

With this I am currently not really convinced that this will have enough positive impact to implement it. In particular I expect that those sites that impose any password requirements won't use it.

I'm sorry I just don't see the logic here: "we currently don't have anything for sites to use to specify password rules" -> "so let's look at the uses of the autocomplete attribute" -> "it's not used much" -> "therefore password-rules wouldn't be used either" -> "we should not implement this standard".

My thoughts:

  1. the majority of traffic is not to the long tail. I suspect if even the top 200 sites all implemented password-rules it would greatly benefit users because they'd be able to enjoy a much better password manager UX implying they might actually use one.
  2. sometimes just providing the right abstraction is enough to garner adoption.

It's about opportunity costs. Investing into this is a decision against investing into something else. With infinite resources, I agree that it would be nice to see this happen.

dcow commented

ascii-printable is the right default

@battre said

a default of "allowed: ascii-printable" would be problematic of course because it would exclude non-ascii alphabets.

I strongly advise against encouraging unicode in passwords until we can be confident that the site properly normalizes the unicode. There are (at least) three different byte sequences that a character like "Å" can come out as. If the system accepting the password doesn't systematically normalize the input, then a user can find themselves locked out merely by changing their keyboard or even an operating system upgrade.

Entropy calculation in the face of multiple required sets

Ensuring a uniform distribution and being able to compute the strength is a tricky problem. My colleagues and I presented a solution; please see https://github.com/1Password/spg/blob/master/passwordscon/paper.pdf

Why not just consider the entropy of the password without the required characters and use that as a lower bound? Then just add on the required characters from there? It's not optimal, sure, but wouldn't it work?

For example, let's say we want 80 bits of entropy. Let's assume printable ASCII. That's ~95^12, or 12 characters. Let's say most sites require 3 different types of characters. A 15-character password should then have a lower bound of about 80 bits of entropy if 12 of its characters are generated using a uniform distribution. 9 characters gives ~60 bits, so 12 chars in total (pretty reasonable). We probably wouldn't want to go lower than 7 chars (~2^46) + 3, or 10 chars. So basically pick 10, 12, or 15 and suggest one of those. Or suggest in the language a minimum number like 8 (that's roughly 53 bits) prior to any character class restrictions, and recommend that the minimum should be 1 more for each character restriction added.
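The estimate above is easy to reproduce. This sketch counts only the positions that are not reserved for a required class and assumes each of them is drawn uniformly and independently from the 95 printable ASCII characters. `lower_bound_bits` is a hypothetical name, and the result is a rough floor, not an exact entropy figure.

```python
import math

PRINTABLE_ASCII = 95  # characters 0x20 through 0x7E, space included


def lower_bound_bits(length, required_count, alphabet_size=PRINTABLE_ASCII):
    """Entropy contributed by the positions not reserved for required classes."""
    free_positions = max(length - required_count, 0)
    return free_positions * math.log2(alphabet_size)


# Total lengths of 10, 12 and 15 with 3 required character classes.
for length in (10, 12, 15):
    print(length, round(lower_bound_bits(length, 3), 1))
```

With three required classes this gives roughly 46, 59 and 79 bits for total lengths of 10, 12 and 15.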

@battre Do you have more recent statistics about the use of autocomplete?

No, I don't have any metrics easily accessible.

I want to point out that separating websites into "modern" and "not modern" without talking about at least backend vs frontend is an overly simplistic approach to determine the impact of this proposal. There are plenty of situations where it is relatively easy to rewrite a frontend to be much more modern than the underlying backend. Particularly in my home country there's a ton of "lipstick-on-a-pig" development going on, where frontends are written in some JS framework from the past 5 years while the URLs hint at some absolutely arcane backend. So, you cannot conclude that a recently updated website (according to some stats) will not have any use for this new attribute. It may still serve up data from a backend that was written in the early 2000s and therefore have a need to expose password restrictions.

Additionally, some companies find themselves in situations where lack of hashing is a feature. A recent example is a globally operating ISP I won't name that does not hash its customers' passwords, because customer support reps need to ask for passwords on the phone. You may disagree with the entire premise of that decision, but that ISP's website is as modern as it can be, as far as web standards go. Assuming you don't reject the premise, password managers aren't a great fit for the user anyway, though.

I am not sure about the opportunity cost of this proposal myself, but I think the way people in this thread talk about use cases and stats passes value judgement on attribute usage too quickly.

That said, people are already implementing this proposal in both websites and password managers, so not standardizing some iteration of this would IMO be a mistake, regardless of opportunity cost.

It's implemented in Safari as of Safari 12. Apple even provides a generation tool. It's not in WebKit, just in Safari.

Checking usage in httparchive: currently 22 pages use a passwordrules attribute on input (out of the total 12,822,310 pages, so ~0.0002% of pages):

https://my.tvnow.de/
https://www.instrumentl.com/
https://adverteren.autotelexpro.nl/
https://auth.digidentity.eu/
https://my.tvnow.at/
https://poblano.jp/
https://www.markets.com/
https://lernwelt.drv-bund.de/
https://my.tvnow.ch/
https://punterplay.com/
https://tool.studyforge.net/
https://upcmail.upc.pl/
https://upcmail.upc.sk/
https://app.propstreet.com/
https://downloads.datastax.com/
https://jedcoacademy.com/
https://najmacademy.com/
https://upcmail.upc.ie/
https://content.erosplatform.com/
https://login.k-auto.fi/
https://pmcardio.powerfulmedical.com/
https://www.lowesbenefitsplus.com/

Results with extracted markup snippets and page rank: https://docs.google.com/spreadsheets/d/1O3ogKWl9Jkm98BC8OdQLK6AXnEsiPbIN-EEOZcl7xvE/edit?usp=sharing

query
SELECT
  *
FROM (
  SELECT
    rb.page AS page,
    rb.url AS url,
    sp.rank AS rank,
    REGEXP_EXTRACT(body, r'(?i)(<input\s+[^>]+\bpasswordrules\s*=[^>]+>)') AS match
  FROM
    `httparchive.response_bodies.2023_06_01_desktop` AS rb # TABLESAMPLE SYSTEM (1 PERCENT)
  JOIN
    `httparchive.summary_pages.2023_06_01_desktop` AS sp
  ON
    (rb.page = sp.url) )
WHERE
  match IS NOT NULL
ORDER BY rank, page

Also 19 instances in GitHub code search

It seems that so far this feature has very low adoption rate despite being supported in Safari and 1Password for 5 years. Maybe it would be more rapidly adopted if implemented in all browsers and password managers, though.

@zcorpan

It seems that so far this feature has very low adoption rate despite being supported in Safari and 1Password for 5 years. Maybe it would be more rapidly adopted if implemented in all browsers and password managers, though.

Doesn't help that it's gotten almost no media or mainstream tech blog coverage, either. web.dev has no reference to it anywhere, not even in their sign-in best practices page. This reddit post, linking to this blog post about the attribute, is 2 years old and only has a score of 14 and 8 comments.

Also, Mozilla explicitly objects to the proposal's premise: https://mozilla.github.io/standards-positions/#passwordrules-attribute

We believe this proposal, as drafted, encourages bad practices around passwords without encouraging good practices (such as minimum password length), and further has ambiguous and conflicting overlap with existing input validity attributes. We believe the existing input validity attributes and API are sufficient for expressing password requirements.

And personally, having taken a step back from this for so long, I don't feel they're wrong. If you're forward-thinking enough with security to think to use this new attribute to help people, you're also likely forward-thinking enough to know why limiting password characters is a bad idea. You also may even know that passphrases are more secure, despite the smaller source alphabet. So honestly, I find it unlikely this will ever take off even if it did gain wider awareness and publicity, not in 2 years, not in 5, not in 10.

@dead-claudia

limiting password characters is a bad idea

I don't really understand what you're saying here.
I subscribed to this ticket because I was looking for a way to enforce

  • a minimum characters length
  • using at least a lowercase, an uppercase, a number and a special character

I don't think that's limiting password characters; it's actually the opposite: forcing the use of a wide range of character types.

Note that according to Chromium's official documentation, the password generator is supposed to take into account attributes like maxlength, pattern and more:

PasswordGenerationManager takes messages from the renderer and makes an OS specific dropdown. This UI use a PasswordGenerator to create a reasonable password for this site (tries to take in account maxlength attribute, pattern attribute, etc.). If the password is accepted, it is sent back to the renderer.

(Notice the important etc. here)

What happens in reality is that

  • Chromium is completely ignoring the pattern and minLength attributes. The only working attribute is maxLength which is a bit useless since forcing a maximum length isn't in any way helping to create a strong password (ok it's reasonable to enforce the password not to be more than 1000 chars long for example)
  • Firefox is also completely ignoring these attributes

The passwordrules attribute seemed like a very good idea. Too bad only Safari supports it, and too bad developers of other browsers like Chromium don't seem to care about guiding people to create strong passwords.

By the way, I personally use KeePass to create completely random, 32 chars long passwords with any kind of characters so my passwords are strong enough. No need to use passphrases.

@jmevel Read that last paragraph a little more closely. I wasn't explaining why password limitations were bad. I was explaining why the attribute would never get used in practice. And the supporting point wasn't directly that they're bad, but knowledge of them being bad resulting in the server being designed without much in terms of password restrictions other than maybe the obvious of minimum search space, not being one of the most used passwords, a minimum of 8-12 characters, and an unrealistic max of like 64+ characters just for the sake of security. If the server accepts anything, the password manager can generate anything, and no reasonable password manager defaults to anything less than 12 characters anyways (so it doesn't matter).

As for the problem sites this attribute is even being suggested for, they are also almost universally at best half-maintained. They aren't likely adding these attributes anytime soon. If you're lucky, they might use minlength and maxlength in their forms because those pre-date HTML5. It'd take about as long to get them to update their backend systems to remove the unnecessary restrictions (preferred) as it would take to get them to add this proposed attribute.