mrtazz/restclient-cpp

Headers lost when recieving headers with same key

ElonDusk26 opened this issue · 2 comments

Expected behaviour

the following code sends a request to instagram and prints all of the recieved headers in to the console:

    RestClient::Connection *conn = new RestClient::Connection("https://www.instagram.com");

    auto r = conn->get("/");

    for(auto itr = r.headers.begin(); itr != r.headers.end(); itr++)
    {
        std::cout << (*itr).first << " " << (*itr).second << std::endl;
    }

outputs:

content-type text/html; charset=utf-8
date Wed, 12 Aug 2020 12:04:16 GMT
expires Sat, 01 Jan 2000 00:00:00 GMT
pragma no-cache
set-cookie mid=XzPawAAEAAGWW1ve4HIUz7lE8M5n; Domain=.instagram.com; expires=Sat, 10-Aug-2030 12:04:16 GMT; Max-Age=315360000; Path=/; Secure
strict-transport-security max-age=31536000
vary Cookie, Accept-Language, Accept-Encoding
x-aed 15
x-content-type-options nosniff
x-fb-trip-id 1679558926
x-frame-options SAMEORIGIN
x-ig-push-state c2
x-xss-protection 0

Where it should be:

x-content-type-options: nosniff
x-xss-protection: 0
x-ig-push-state: c2
x-aed: 15
access-control-expose-headers: X-IG-Set-WWW-Claim
set-cookie: sessionid=""; Domain=instagram.com; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: sessionid=""; Domain=.instagram.com; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: sessionid=""; Domain=i.instagram.com; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: sessionid=""; Domain=.i.instagram.com; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: sessionid=""; Domain=www.instagram.com; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: sessionid=""; Domain=.www.instagram.com; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: sessionid=""; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: csrftoken=gdHwKLzFimBTeV6Tu0QNdaVC8b7SQMYh; Domain=.instagram.com; expires=Wed, 11-Aug-2021 12:07:07 GMT; Max-Age=31449600; Path=/; Secure
set-cookie: rur=FTW; Domain=.instagram.com; HttpOnly; Path=/; Secure
set-cookie: urlgen="{\"130.---.---.---\": 9158}:1k5pX9:tg69jMyLMW_zC0Hzgj7dwBTtinQ"; Domain=.instagram.com; HttpOnly; Path=/; Secure
content-length: 11819
x-fb-trip-id: 1679558926
alt-svc: h3-29=":443"; ma=3600,h3-27=":443"; ma=3600
X-Firefox-Spdy: h2

Theres a difference in the headers but the point is that its deleting set-cookie headers instead of appending them with something like an id or index number.

Solution

Append headers with an id as to not overwrite existing headers.

EDIT:

After doing some digging i found that HeaderField is made of std::map<std::string, std::string> which is not optimal, should be changed to a multimap to support multiple values of same key

I agree with ElonDusk, that changing HeaderField from std::map to std::multimap would be a good idea. That std::multimap should use the same caseless sorting, that I have already proposed in #157

Unfortunately, changing to multimap is going to break code, that already uses this library, as std::map::at() and std::map::operator[], which are commonly used to search for header fields, have no counterparts in std::multimap. Instead, std::multimap::find() and std::multimap::equal_range() need to be used now, which makes code more lengthy. So this old code:

RestClient::Response& r = conn->get("/URL");
std::string& ctype = r.headers["Content-Type"];
std::string& clength = r.headers["Content-Length"];

now becomes something ugly like this:

RestClient::Response& r = conn->get("/URL");
std::string ctype, clength;
if(auto hdrit = r.headers.find("Content-Type"); hdrit != r.headers.end()) {
    ctype = hdrit->second;
}
if(auto hdrit = r.headers.find("Content-Length"); hdrit != r.headers.end()) {
    clength = hdrit->second;
}

On the other hand, the library user would now be able to iterate over all Cookies received:

for(auto [rbegin, rend] = r.headers.equal_range("Set-Cookie"); rbegin != rend; rbegin++) {
    std::cout << "Cookie received: " << rbegin->second << std::endl;
}

Suggestion

  • Add a new field fullheaders to RestClient::Info and RestClient::Response, which is a case-insensitive multimap.
  • Replace RestClient::Info::headers and RestClient::Response::headers with a proxy class, that offers operator [] and the method at() with the old style behaviour (find the last match only) for compatibility for old programs.
  • Add a convenience method RestClient::Response::GetHeader(const std::string& name), which returns the last header named name. It throws std::out_of_range, if no matching header can be found.
  • Add a convenience method RestClient::Response::GetHeader(const std::string& name, const std::string& dflt), which returns the last header named name, or dflt, if no such header can be found.

Any further suggestions?

@ElonDusk26 and @KaiPetzke maybe the two of you could work together on that change?