ericcornelissen/wordrow

The regexp used to match queries in the target text is vulnerable to attacks

ericcornelissen opened this issue · 1 comments

Bug Report

  • wordrow version: v0.4.0-beta
  • Operating system: n/a

Description

The regular expression used to find instances of the from value of a Mapping is vulnerable to attacks from within a mapping file (or, theoretically, an inline mapping using --map). Assuming users define their own mappings, this is merely a nuisance if they accidentally define an illegal mapping. In a worse scenario, where the mapping is not defined by the users themselves, the results may be more severe.

Actual Behaviour

  1. The program panics because the regular expression becomes invalid due to the defined mapping.
  2. The program takes a long time to execute because the regular expression becomes incredibly complex due to the defined mapping.
  3. Regular expression syntax in a mapping is interpreted as such and can be used to construct complex mappings, even though mappings are not intended to support regular expression syntax.

Expected Behaviour

  1. The program does not crash or hang due to any user defined mappings.
  2. Regular expression syntax in a user-defined mapping is ignored.

Working Example

From initial fuzzing I found that the following kinds of inputs may cause problems related to the regular expression:

  1. (Found in c9d3550) A user-defined mapping contains bytes that are not valid UTF-8. The documentation for the regexp package specifies that "All characters are UTF-8-encoded code points", so wordrow panics on such input.
  2. (Found in c656751) A user-defined mapping contains a character that has a special meaning in regular expressions, used in a way that is syntactically invalid (for example, an unmatched closing parenthesis). This makes the regular expression invalid and wordrow panics. This may be solved using the QuoteMeta function.
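
The QuoteMeta suggestion in point 2 could look roughly like the sketch below. It is not taken from the wordrow source: `compileLiteral` is a hypothetical helper name, and it assumes the from value is turned into a pattern with `regexp.Compile`. Only `regexp.QuoteMeta` and `regexp.Compile` are real standard library calls.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileLiteral is a hypothetical helper (not part of wordrow). It escapes
// every regexp metacharacter in from, so the value is matched literally.
func compileLiteral(from string) (*regexp.Regexp, error) {
	return regexp.Compile(regexp.QuoteMeta(from))
}

func main() {
	// An unmatched closing parenthesis is an invalid pattern on its own.
	if _, err := regexp.Compile("foo)"); err != nil {
		fmt.Println("unescaped:", err)
	}

	// After escaping, the same value compiles and is matched literally.
	re, err := compileLiteral("foo)")
	if err != nil {
		fmt.Println("escaped:", err)
		return
	}
	fmt.Println("matches:", re.MatchString("a foo) b"))
}
```

Using `regexp.Compile` rather than `regexp.MustCompile` returns an error to the caller instead of panicking, which fits the expected behaviour that the program does not crash on user-defined mappings.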

Log Output

Omitted for brevity.

Regarding point 2: this holds not only for closing parentheses, but for all special characters in Go regular expressions. Luckily, Go's regexp package provides a function called QuoteMeta that automatically escapes all problematic characters in a string 😃

EDITs:

  1. To see or test this in action, see 9dd9373.
  2. Updated the original post's description according to the details in this comment.

Note: this function does not solve the problems described in point 1!
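
One possible way to guard against point 1 is to reject values that are not valid UTF-8 before building the pattern. The sketch below is an assumption about how that could be done, not what wordrow does: `compileMappingValue` and `errInvalidUTF8` are hypothetical names, while `utf8.ValidString`, `regexp.QuoteMeta` and `regexp.Compile` are standard library functions.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"unicode/utf8"
)

// errInvalidUTF8 is a hypothetical sentinel error for this sketch.
var errInvalidUTF8 = errors.New("mapping value is not valid UTF-8")

// compileMappingValue is a hypothetical helper (not part of wordrow). It
// rejects values that are not valid UTF-8, then escapes the value so it is
// matched literally.
func compileMappingValue(from string) (*regexp.Regexp, error) {
	if !utf8.ValidString(from) {
		return nil, errInvalidUTF8
	}
	return regexp.Compile(regexp.QuoteMeta(from))
}

func main() {
	// "\xff" is a byte that can never occur in valid UTF-8.
	if _, err := compileMappingValue("caf\xff"); err != nil {
		fmt.Println("rejected:", err)
	}

	// A valid value, including non-ASCII characters, is accepted.
	if _, err := compileMappingValue("café"); err == nil {
		fmt.Println("accepted")
	}
}
```

Whether to reject such a mapping, skip it, or sanitise it (for example with `strings.ToValidUTF8`) is a design choice that is left open here.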