sindresorhus/normalize-url

Regex for removal of duplicate slashes not preceded by a protocol is too strict

Closed this issue · 0 comments

gcox commented
// Remove duplicate slashes if not preceded by a protocol
if (urlObj.pathname) {
  urlObj.pathname = urlObj.pathname.replace(/(?<!https?:)\/{2,}/g, '/');
}

Limiting the preceding protocol to http or https is too strict. Valid URLs can contain other protocols (ftp, s3, git, etc.) in their path, and this regex collapses the // that follows them, corrupting the embedded URL.

Example URLs broken by this regex:

  • http://sindresorhus.com/s3://sindresorhus.com becomes http://sindresorhus.com/s3:/sindresorhus.com
  • http://sindresorhus.com/git://sindresorhus.com becomes http://sindresorhus.com/git:/sindresorhus.com
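The behavior above can be reproduced on the pathname alone. The sketch below compares the regex quoted in this issue with one possible looser pattern that protects any scheme-like prefix (a letter followed by letters, digits, "+", "." or "-", then a colon, following the scheme grammar in RFC 3986). The names strictPattern, permissivePattern, and collapseSlashes are hypothetical and exist only for this illustration; the looser pattern is an assumption about what a fix could look like, not the library's actual code.

```javascript
// Regex quoted in the issue: only "http:" and "https:" protect a following "//".
const strictPattern = /(?<!https?:)\/{2,}/g;

// Hypothetical alternative (an assumption, not the library's fix):
// any scheme-like prefix ending in ":" protects the "//" that follows it.
const permissivePattern = /(?<![a-z][a-z\d+.-]*:)\/{2,}/gi;

// Collapse runs of two or more slashes that the pattern does not protect.
function collapseSlashes(pathname, pattern) {
  return pathname.replace(pattern, '/');
}

const samples = [
  '/s3://sindresorhus.com',
  '/git://sindresorhus.com',
  '/foo//bar',
];

for (const pathname of samples) {
  console.log(pathname);
  console.log('  strict:    ', collapseSlashes(pathname, strictPattern));
  console.log('  permissive:', collapseSlashes(pathname, permissivePattern));
}
```

With the strict pattern the first two samples lose a slash after the scheme, as in the examples above, while the looser pattern leaves them intact and still collapses /foo//bar to /foo/bar. The trade-off is that any path segment ending in a scheme-like token plus a colon also keeps its double slash, and the variable-length lookbehind requires a JavaScript engine that supports lookbehind assertions.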

Real world URL broken by this regex: