creationix/http-parser-js

encoding problem?

Janpot opened this issue · 5 comments

Try the following code:

process.binding('http_parser').HTTPParser = require('http-parser-js').HTTPParser;
var request = require('request');
request('http://www.barkarbykiropraktik.com/sv-SE/kiropraktik/varför-får-man-värk--25236090', (err, res) => console.log(res.request.uri.href));
// prints: http://www.barkarbykiropraktik.com/sv-SE/kiropraktik/varför-får-man-värk--25236090

It seems to make the wrong request.

I get the result you quoted when I do not use http-parser-js. With http-parser-js (at least on Windows), I get http://www.barkarbykiropraktik.com/sv-SE/kiropraktik/varfC6r-fC%r-man-vC$rk--25236090 (also seems wrong). Do you get the expected result when not using http-parser-js?
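The garbled URL above is consistent with the UTF-8 bytes of the path being decoded as ASCII: Node's 'ascii' decoding clears the high bit of every byte, so each two-byte character collapses into two unrelated ASCII characters. A minimal sketch of that effect, using only Node's built-in Buffer (the helper name decodeUtf8BytesAs is made up for illustration and is not part of http-parser-js):

```javascript
// Hypothetical helper: encode `text` as UTF-8 bytes, then decode those
// bytes with a different Buffer encoding to see what a parser would produce.
function decodeUtf8BytesAs(text, encoding) {
  return Buffer.from(text, 'utf8').toString(encoding);
}

// "varför-får-man-värk", written with escapes so the source file encoding does not matter.
const segment = 'varf\u00f6r-f\u00e5r-man-v\u00e4rk';

// 'ascii' strips the high bit: ö (C3 B6) becomes "C6", å (C3 A5) becomes "C%", ä (C3 A4) becomes "C$".
console.log(decodeUtf8BytesAs(segment, 'ascii')); // varfC6r-fC%r-man-vC$rk

// 'utf8' round-trips the original text.
console.log(decodeUtf8BytesAs(segment, 'utf8') === segment); // true
```

This only reproduces the symptom; it does not show where in the parser the decoding happens.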

I think you should use encodeURI here; that is exactly what the function exists for.
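As a rough illustration of that suggestion, the standard encodeURI function percent-encodes the non-ASCII characters and leaves the URL structure alone. The wrapper name toRequestUrl is hypothetical, and the sketch assumes the input is not already percent-encoded (encodeURI would escape an existing "%" again):

```javascript
// Hypothetical wrapper: percent-encode a raw, not-yet-encoded URL
// so that only ASCII characters go out on the wire.
function toRequestUrl(rawUrl) {
  return encodeURI(rawUrl);
}

const raw = 'http://www.barkarbykiropraktik.com/sv-SE/kiropraktik/varf\u00f6r-f\u00e5r-man-v\u00e4rk--25236090';

console.log(toRequestUrl(raw));
// http://www.barkarbykiropraktik.com/sv-SE/kiropraktik/varf%C3%B6r-f%C3%A5r-man-v%C3%A4rk--25236090
```

This helps for URLs the client builds itself; it does not help when the server sends an unencoded header back.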

I get the expected result when not using the parser (Node 4.4 on Ubuntu and Node 6.2 on OS X).

https://github.com/creationix/http-parser-js/blob/master/http-parser.js#L163-L180

var HTTPParser = require('./http-parser-js/http-parser.js').HTTPParser;
HTTPParser.prototype.consumeLine = function () {
	var end = this.end,
		chunk = this.chunk;

	for (var i = this.offset; i < end; i++) {
		if (chunk[i] === 0x0a) { // \n
			var line = this.line + chunk.toString('utf8', this.offset, i);
			if (line.charAt(line.length - 1) === '\r') {
				line = line.substr(0, line.length - 1);
			}
			this.line = '';
			this.offset = i + 1;
			return line;
		}
	}

	// No newline found: the line is split over multiple chunks, so buffer the partial line.
	this.line += chunk.toString('utf8', this.offset, this.end);
	this.offset = this.end;
};
process.binding('http_parser').HTTPParser = HTTPParser;

This breaks one test, but adds UTF-8 support, which some sites actually use in their headers.
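One caveat with decoding each chunk separately as UTF-8: if a multi-byte character happens to straddle a chunk boundary, each half decodes to a replacement character. The sketch below illustrates this with Node's built-in string_decoder module; the function names are made up for the example, and this is not how the patch above or http-parser-js itself handles it:

```javascript
const { StringDecoder } = require('string_decoder');

// Decode every chunk independently, as chunk.toString('utf8', ...) does.
function decodeChunksNaively(chunks) {
  return chunks.map((chunk) => chunk.toString('utf8')).join('');
}

// Decode through a StringDecoder, which holds back incomplete
// multi-byte sequences until the remaining bytes arrive.
function decodeChunksWithDecoder(chunks) {
  const decoder = new StringDecoder('utf8');
  let out = '';
  for (const chunk of chunks) {
    out += decoder.write(chunk);
  }
  return out + decoder.end();
}

// "värk" is 76 C3 A4 72 6B in UTF-8; split in the middle of "ä".
const bytes = Buffer.from('v\u00e4rk', 'utf8');
const chunks = [bytes.slice(0, 2), bytes.slice(2)];

console.log(decodeChunksNaively(chunks).includes('\ufffd'));      // true: the split character is lost
console.log(decodeChunksWithDecoder(chunks) === 'v\u00e4rk');     // true: the character survives
```

Whether this matters in practice depends on how often header lines are split mid-character across chunks.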

In our case we were making a request to a normal URL, and the server was responding with an unencoded Location header... Google, curl, and node-request handle this properly out of the box, but over here we are enforcing ASCII.

It looks like this was fixed in #13; you may have to set HTTPParser.encoding = 'utf8'.