webtorrent/node-bencode

Can't get exact binary key values

deoxxa opened this issue · 8 comments

This is the content of http://open.nyaatorrents.info:6544/scrape?info_hash=%91%ee%82%12%97%98%57%a5%f4%06%9e%c7%b2%8b%eb%47%cf%c1%db%f1 right now. It's the scrape output from nyaatorrents for the torrent with the hash 91ee8212979857a5f4069ec7b28beb47cfc1dbf1.

ZDU6ZmlsZXNkMjA6ke6CEpeYV6X0Bp7HsovrR8/B2/FkODpjb21wbGV0ZWkyZTEwOmRvd25sb2FkZWRpMTBlMTA6aW5jb21wbGV0ZWkyNzVlZWVl

Here it is in a script:

var bencode = require("./");

var data = Buffer("ZDU6ZmlsZXNkMjA6ke6CEpeYV6X0Bp7HsovrR8/B2/FkODpjb21wbGV0ZWkyZTEwOmRvd25sb2FkZWRpMTBlMTA6aW5jb21wbGV0ZWkyNzVlZWVl", "base64");

var obj = bencode.decode(data);

console.log(obj);

console.log(Object.keys(obj.files)[0]);

console.log(Buffer(object.keys(obj.files)[0]).toString("hex"));

As you'd expect, the last console.log outputs gibberish. This is due to JavaScript strings not playing too nicely with binary data. The only way around this is to avoid storing objects as objects (and thus keys as strings), or to encode the keys differently, as far as I can see.

Oh okay, that problem. Yep, we know about that one and nope, not gonna fix it since it would break our API quite heavily. Sorry for the missunderstandings. I don't think we can fix this nicely in a sync module, streaming should be possible thou, @jhermsmeier is on that topic.

I think you might be able to, by adding some methods to the object's
prototype. Specifically something like binaryKeys() and
binaryGet(buffer) with a map from buffers to strings. As long as there
are no collisions, it'd work alright. I think.

On Sunday, April 27, 2014, Mark Schmale notifications@github.com wrote:

Oh okay, that problem. Yep, we know about that one and nope, not gonna fix
it since it would break our API quite heavily. Sorry for the
missunderstandings. I don't think we can fix this nicely in a sync module,
streaming should be possible thou, @jhermsmeierhttps://github.com/jhermsmeieris on that topic.


Reply to this email directly or view it on GitHubhttps://github.com//issues/23#issuecomment-41495590
.

Honestly, I think we should get this fixed. If binary keys don't work, it's broken.
And if fixing it means breaking the API, I'm totally fine with that. (Well, we could just go with .toString('hex') for bin keys, but thinking about it already feels wrong).

Sounds like we'd have to implement our own form of a map data structure to get that working properly, as @deoxxa said – but I'm not fond of having to .get( key ) – it also would break iterating over it etc.

OK, now that I have even confused myself, any thoughts?

EDIT: Regarding the streaming en/decoders, @themasch: I'd like to keep the in- and outputs of both the sync and streaming API consistent, so supporting it in one, but not the other seems a bit broken as well.

Given that binary keys are kind of a special case, and that being able to use objects like objects is really nice, I'd be inclined to agree with Mark here about not breaking the current API. That doesn't, however, preclude adding additional (non-iterable, private) methods/properties to the objects to support the use-case of interacting with an object via its original binary keys. I might hack up a prototype today and see how it goes.

Ah, okay. So you're thinking about something like this?

// Accessor methods for both own props and bin keys
object.get( key )
object.set( key )
// Array of all keys (incl. own props)
object.keys()
// Array of all bin keys
object.binaryKeys()

Something like that, yeah. That way people will be able to do things like this:

var obj = bencode.decode(scrape_response);

var key = Buffer("91ee8212979857a5f4069ec7b28beb47cfc1dbf1", "hex");

console.log(obj.files.get(key));

I think that's pretty unobtrusive; if people never need to use the binary interface, they don't even need to know it exists.

Okay, that sounds cool. Doesn't break old usage but offers new features. Great guys!

going to be fixed by #24