cpp-netlib/uri

Comparing a URI containing a percent-encoded query triggers assert


Comparing two nonempty URI instances triggers an assert if at least one of the URI instances contains at least one percent-encoded character in its query.

The following program demonstrates the problem.

#include <iostream>
#include <network/uri.hpp>

int main()
{
  const auto u = network::uri_builder()
    .scheme("http")
    .host("example.com")
    .query("foo", "bar\\baz")
    .uri();

  const auto v = network::uri_builder()
    .scheme("http")
    .host("example.com")
    .uri();

  std::cout << u.string() << '\n';

  const auto p = (u == v);
  std::cout << p << '\n';
}

Program output:

http://example.com?foo=bar\baz
Assertion failed: is_valid, file C:\Source\uri\src\uri.cpp, line 401

By contrast, the program behaves correctly if the query value bar\\baz is changed to bar/baz. The failure is not specific to the backslash: the program fails for any query string that contains at least one percent-encoded character.

The underlying problem appears to be buggy normalization. The following program, which performs no comparison at all, triggers the same assertion.

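#include <network/uri.hpp>

int main() {
  // %5c is a percent-encoded backslash in the query.
  network::uri u("http://www.example.com?foo=%5cbar");
  // Syntax-based normalization alone is enough to trigger the assertion.
  u.normalize(network::uri_comparison_level::syntax_based);
}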

The normalize function indiscriminately decodes all percent-encoded characters, then parses the resulting string as a URI. When the parser reaches the decoded backslash in the query (\ decoded from %5C), it recognizes the character as invalid in a query and stops parsing, reporting the URI as not valid.
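For comparison, RFC 3986 (section 6.2.2.2) limits percent-decoding during normalization to octets that correspond to unreserved characters. Every other percent-encoded octet stays encoded, and section 6.2.2.1 says its hex digits should be uppercased. The sketch below illustrates that rule on a plain std::string using only the standard library. It is not the library's implementation or its actual fix, and the helper names (is_unreserved, hex_value, normalize_percent_encoding) are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Unreserved characters per RFC 3986, section 2.3:
// ALPHA / DIGIT / "-" / "." / "_" / "~".
inline bool is_unreserved(unsigned char c) {
  return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
         (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
}

// Returns the numeric value of a hex digit, or -1 if c is not a hex digit.
inline int hex_value(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Hypothetical helper: decodes only percent-encoded unreserved characters
// and rewrites every other valid triplet with uppercase hex digits.
// Malformed sequences (a '%' not followed by two hex digits) are copied as-is.
inline std::string normalize_percent_encoding(const std::string& in) {
  static const char digits[] = "0123456789ABCDEF";
  std::string out;
  out.reserve(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i] == '%' && i + 2 < in.size()) {
      const int hi = hex_value(in[i + 1]);
      const int lo = hex_value(in[i + 2]);
      if (hi >= 0 && lo >= 0) {
        const unsigned char decoded = static_cast<unsigned char>(hi * 16 + lo);
        if (is_unreserved(decoded)) {
          out += static_cast<char>(decoded);  // safe to decode, e.g. %7E -> ~
        } else {
          out += '%';                         // keep encoded, e.g. %5c -> %5C
          out += digits[hi];
          out += digits[lo];
        }
        i += 2;  // skip the two hex digits just consumed
        continue;
      }
    }
    out += in[i];
  }
  return out;
}
```

With this rule, the query foo=%5cbar from the program above would become foo=%5Cbar. The backslash is never decoded, so a parser that runs afterwards does not see an invalid query character.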