Headers lost when recieving headers with same key
ElonDusk26 opened this issue · 2 comments
Expected behaviour
the following code sends a request to instagram and prints all of the recieved headers in to the console:
RestClient::Connection *conn = new RestClient::Connection("https://www.instagram.com");
auto r = conn->get("/");
for(auto itr = r.headers.begin(); itr != r.headers.end(); itr++)
{
std::cout << (*itr).first << " " << (*itr).second << std::endl;
}
outputs:
content-type text/html; charset=utf-8
date Wed, 12 Aug 2020 12:04:16 GMT
expires Sat, 01 Jan 2000 00:00:00 GMT
pragma no-cache
set-cookie mid=XzPawAAEAAGWW1ve4HIUz7lE8M5n; Domain=.instagram.com; expires=Sat, 10-Aug-2030 12:04:16 GMT; Max-Age=315360000; Path=/; Secure
strict-transport-security max-age=31536000
vary Cookie, Accept-Language, Accept-Encoding
x-aed 15
x-content-type-options nosniff
x-fb-trip-id 1679558926
x-frame-options SAMEORIGIN
x-ig-push-state c2
x-xss-protection 0
Where it should be:
x-content-type-options: nosniff
x-xss-protection: 0
x-ig-push-state: c2
x-aed: 15
access-control-expose-headers: X-IG-Set-WWW-Claim
set-cookie: sessionid=""; Domain=instagram.com; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: sessionid=""; Domain=.instagram.com; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: sessionid=""; Domain=i.instagram.com; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: sessionid=""; Domain=.i.instagram.com; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: sessionid=""; Domain=www.instagram.com; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: sessionid=""; Domain=.www.instagram.com; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: sessionid=""; expires=Thu, 01-Jan-1970 00:00:00 GMT; Max-Age=0; Path=/
set-cookie: csrftoken=gdHwKLzFimBTeV6Tu0QNdaVC8b7SQMYh; Domain=.instagram.com; expires=Wed, 11-Aug-2021 12:07:07 GMT; Max-Age=31449600; Path=/; Secure
set-cookie: rur=FTW; Domain=.instagram.com; HttpOnly; Path=/; Secure
set-cookie: urlgen="{\"130.---.---.---\": 9158}:1k5pX9:tg69jMyLMW_zC0Hzgj7dwBTtinQ"; Domain=.instagram.com; HttpOnly; Path=/; Secure
content-length: 11819
x-fb-trip-id: 1679558926
alt-svc: h3-29=":443"; ma=3600,h3-27=":443"; ma=3600
X-Firefox-Spdy: h2
Theres a difference in the headers but the point is that its deleting set-cookie headers instead of appending them with something like an id or index number.
Solution
Append headers with an id as to not overwrite existing headers.
EDIT:
After doing some digging i found that HeaderField is made of std::map<std::string, std::string> which is not optimal, should be changed to a multimap to support multiple values of same key
I agree with ElonDusk, that changing HeaderField from std::map to std::multimap would be a good idea. That std::multimap should use the same caseless sorting, that I have already proposed in #157
Unfortunately, changing to multimap is going to break code, that already uses this library, as std::map::at() and std::map::operator[], which are commonly used to search for header fields, have no counterparts in std::multimap. Instead, std::multimap::find() and std::multimap::equal_range() need to be used now, which makes code more lengthy. So this old code:
RestClient::Response& r = conn->get("/URL");
std::string& ctype = r.headers["Content-Type"];
std::string& clength = r.headers["Content-Length"];
now becomes something ugly like this:
RestClient::Response& r = conn->get("/URL");
std::string ctype, clength;
if(auto hdrit = r.headers.find("Content-Type"); hdrit != r.headers.end()) {
ctype = hdrit->second;
}
if(auto hdrit = r.headers.find("Content-Length"); hdrit != r.headers.end()) {
clength = hdrit->second;
}
On the other hand, the library user would now be able to iterate over all Cookies received:
for(auto [rbegin, rend] = r.headers.equal_range("Set-Cookie"); rbegin != rend; rbegin++) {
std::cout << "Cookie received: " << rbegin->second << std::endl;
}
Suggestion
- Add a new field fullheaders to RestClient::Info and RestClient::Response, which is a case-insensitive multimap.
- Replace RestClient::Info::headers and RestClient::Response::headers with a proxy class, that offers operator [] and the method at() with the old style behaviour (find the last match only) for compatibility for old programs.
- Add a convenience method RestClient::Response::GetHeader(const std::string& name), which returns the last header named name. It throws std::out_of_range, if no matching header can be found.
- Add a convenience method RestClient::Response::GetHeader(const std::string& name, const std::string& dflt), which returns the last header named name, or dflt, if no such header can be found.
Any further suggestions?
@ElonDusk26 and @KaiPetzke maybe the two of you could work together on that change?