Percent-decoding does not accept non-ASCII octets
Closed this issue · 0 comments
mtrenkmann commented
When building a network::uri object using network::uri_builder, query parameters that contain non-ASCII multibyte characters (e.g. UTF-8) are percent-encoded as expected. For example, http://example.com/q=법정동 becomes http://example.com/q=%EB%B2%95%EC%A0%95%EB%8F%99.
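To illustrate the encoding direction without depending on the library, here is a minimal standalone sketch. The helper name percent_encode_octets is hypothetical and is not part of network::uri; it assumes the input is already a sequence of octets (for example UTF-8) and escapes everything outside the RFC 3986 unreserved set, which may differ from what network::uri_builder actually leaves unescaped.

```cpp
#include <string>

// Hypothetical helper, not part of network::uri: percent-encodes every octet
// that is not in the RFC 3986 "unreserved" set (ALPHA / DIGIT / "-" / "." / "_" / "~").
// The function works on raw octets and never interprets the character encoding.
std::string percent_encode_octets(const std::string& in) {
  static const char hex[] = "0123456789ABCDEF";
  std::string out;
  out.reserve(in.size() * 3);
  for (unsigned char c : in) {
    const bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                            (c >= '0' && c <= '9') || c == '-' || c == '.' ||
                            c == '_' || c == '~';
    if (unreserved) {
      out.push_back(static_cast<char>(c));
    } else {
      out.push_back('%');
      out.push_back(hex[c >> 4]);    // high nibble
      out.push_back(hex[c & 0x0F]);  // low nibble
    }
  }
  return out;
}
```

Passing the nine UTF-8 octets of 법정동 (EB B2 95 EC A0 95 EB 8F 99) to this sketch yields the encoded form quoted above.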
However, in the other direction, applying network::uri::decode to the encoded query parameter throws a percent_decoding_error exception. I think this behavior is incorrect: according to RFC 3986, section 2.5, percent-encoding and decoding operate at the octet level and should otherwise be agnostic about character encodings.
Suggested fix in network/uri/detail/decode.hpp:
- if (h0 >= '8') {
-   // unable to decode characters outside the ASCII character set.
-   throw percent_decoding_error(uri_error::conversion_failed);
- }
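For comparison, octet-level decoding without the ASCII restriction can be sketched as below. This is not the code in decode.hpp; percent_decode_octets and hex_value are hypothetical names, and the sketch throws std::invalid_argument instead of the library's percent_decoding_error. It assumes that h0 in the removed check is the first hex digit of a triplet, so rejecting h0 >= '8' rejects every octet from 0x80 upward.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns the numeric value of a hexadecimal digit, or -1 if c is not one.
int hex_value(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Hypothetical helper, not part of network::uri: turns each "%XX" triplet into
// the single octet 0xXX and copies every other character through unchanged.
// Octets >= 0x80 are accepted; only malformed triplets are rejected.
std::string percent_decode_octets(const std::string& in) {
  std::string out;
  out.reserve(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i] != '%') {
      out.push_back(in[i]);
      continue;
    }
    if (in.size() - i < 3) {
      throw std::invalid_argument("truncated percent-encoded triplet");
    }
    const int hi = hex_value(in[i + 1]);
    const int lo = hex_value(in[i + 2]);
    if (hi < 0 || lo < 0) {
      throw std::invalid_argument("non-hexadecimal digit after '%'");
    }
    out.push_back(static_cast<char>(static_cast<unsigned char>((hi << 4) | lo)));
    i += 2;  // skip the two hex digits just consumed
  }
  return out;
}
```

The result is a plain byte string; whether those bytes form valid UTF-8 is left to the caller, which matches the encoding-agnostic reading of RFC 3986 described above.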
Unit tests for reproduction:
- Percent-encoding a UTF-8 query parameter works
- Percent-decoding a UTF-8 query parameter does not work
TEST(UriBuilderTest, PercentEncodingAcceptsNonAsciiOctets) {
  const std::string decoded = u8"법정동";
  const std::string encoded = "%EB%B2%95%EC%A0%95%EB%8F%99";
  network::uri_builder ub(network::uri("http://example.com"));
  ASSERT_NO_THROW(ub.append_query_key_value_pair("q", decoded));
  const network::uri uri = ub.uri();
  ASSERT_EQ(network::string_view(encoded), uri.query_begin()->second);
}

TEST(UriDecodeTest, PercentDecodingAcceptsNonAsciiOctets) {
  const std::string decoded = u8"법정동";
  const std::string encoded = "%EB%B2%95%EC%A0%95%EB%8F%99";
  std::string output;
  ASSERT_NO_THROW(network::uri::decode(encoded.begin(), encoded.end(),
                                       std::back_inserter(output)));
  ASSERT_EQ(decoded, output);
}
Output:
[==========] Running 2 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from UriBuilderTest
[ RUN      ] UriBuilderTest.PercentEncodingAcceptsNonAsciiOctets
[       OK ] UriBuilderTest.PercentEncodingAcceptsNonAsciiOctets (0 ms)
[----------] 1 test from UriBuilderTest (1 ms total)
[----------] 1 test from UriDecodeTest
[ RUN      ] UriDecodeTest.PercentDecodingAcceptsNonAsciiOctets
src/uri_test.cc:53: Failure
Expected: network::uri::decode(encoded.begin(), encoded.end(), std::back_inserter(output)) doesn't throw an exception.
  Actual: it throws.
[  FAILED  ] UriDecodeTest.PercentDecodingAcceptsNonAsciiOctets (0 ms)
[----------] 1 test from UriDecodeTest (0 ms total)
[----------] Global test environment tear-down
[==========] 2 tests from 2 test cases ran. (1 ms total)
[  PASSED  ] 1 test.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] UriDecodeTest.PercentDecodingAcceptsNonAsciiOctets
 1 FAILED TEST