Section 2.2.3 (Special Characters) of the RFC contains two examples of path matching for paths containing the special characters * and $. In both examples the character is percent-encoded in the allow/disallow rule but not encoded in the URL/URI to be matched. The robots.txt parser and matcher does not appear to follow these examples: it fails to match the percent-encoded characters in the rule against the unencoded ones in the URI. See the unit test below.

* and $ are among the reserved characters in URIs (RFC 3986, section 2.2) and therefore cannot be percent-encoded without potentially changing the semantics of the URI.
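
To make the mismatch concrete, here is a minimal sketch of the two comparison strategies. It is not the library's matcher: `DecodeStarAndDollar`, `LiteralPrefixMatch` and `DecodedPrefixMatch` are hypothetical helpers introduced only for illustration, and wildcard handling is deliberately left out. The decoding is limited to %2A and %24, the two characters from the RFC examples, because decoding reserved characters in general could change the meaning of a URI.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper, not part of the robots.txt library: replaces the
// percent-encoded forms of '*' (%2A or %2a) and '$' (%24) with the literal
// characters. All other bytes, including other percent-encoded triplets,
// are copied unchanged.
std::string DecodeStarAndDollar(std::string_view s) {
  std::string out;
  out.reserve(s.size());
  for (std::size_t i = 0; i < s.size(); ++i) {
    if (s[i] == '%' && i + 2 < s.size() && s[i + 1] == '2') {
      if (s[i + 2] == 'A' || s[i + 2] == 'a') {
        out.push_back('*');
        i += 2;  // Skip the two hex digits.
        continue;
      }
      if (s[i + 2] == '4') {
        out.push_back('$');
        i += 2;  // Skip the two hex digits.
        continue;
      }
    }
    out.push_back(s[i]);
  }
  return out;
}

// Byte-wise prefix check without any wildcard handling: "%2A" in the rule
// only matches the three bytes "%2A" in the path.
bool LiteralPrefixMatch(std::string_view rule, std::string_view path) {
  return path.substr(0, rule.size()) == rule;
}

// Prefix check after decoding %2A and %24 on both sides, so an encoded
// character in the rule also matches its unencoded form in the path.
bool DecodedPrefixMatch(std::string_view rule, std::string_view path) {
  const std::string decoded_rule = DecodeStarAndDollar(rule);
  const std::string decoded_path = DecodeStarAndDollar(path);
  return LiteralPrefixMatch(decoded_rule, decoded_path);
}
```

With the rule `/path/file-with-a-%2A.html` and the path `/path/file-with-a-*.html`, `LiteralPrefixMatch` returns false while `DecodedPrefixMatch` returns true. The same holds for `/path/foo-%24` against `/path/foo-$`. A real matcher would additionally have to keep a decoded `*` or `$` distinct from the wildcard and end-of-pattern operators, which this sketch does not attempt.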

diff --git a/ b/
index 35853de..3a37813 100644
--- a/
+++ b/
@@ -492,6 +492,19 @@ TEST(RobotsUnittest, ID_SpecialCharacters) {
         IsUserAgentAllowed(robotstxt, "FooBot", ""));
+  {
+    const absl::string_view robotstxt =
+        "User-agent: FooBot\n"
+        "Disallow: /path/file-with-a-%2A.html\n"
+        "Disallow: /path/foo-%24\n"
+        "Allow: /\n";
+    EXPECT_FALSE(
+        IsUserAgentAllowed(robotstxt, "FooBot",
+                           "*.html"));
+    EXPECT_FALSE(
+        IsUserAgentAllowed(robotstxt, "FooBot",
+                           "$"));
+  }
 // Google-specific: "index.html" (and only that) at the end of a pattern is