Streaming mode?
Closed this issue · 3 comments
I need to detect the character set of potentially huge text files. Would it be possible to do this in a streaming manner somehow, or does the underlying library really need all the data at once?
If it's possible to do this, the ideal interface would IMO be a writable stream that you could keep feeding data into and then call detectCharset() on once you think you've fed it enough data.
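To make the request concrete, here is a minimal sketch of such a writable stream. It is not part of the module: CharsetSampler, its maxBytes cap and the injected detect function are all hypothetical names chosen for illustration.

```javascript
const { Writable } = require('stream');

// Hypothetical helper (not part of the module): keeps at most `maxBytes`
// of whatever is written to it, so memory use stays bounded on huge files.
class CharsetSampler extends Writable {
  constructor(detect, maxBytes = 64 * 1024) {
    super();
    this.detect = detect;        // assumed shape: function (Buffer) -> charset
    this.maxBytes = maxBytes;    // assumed cap on the sample size
    this.sampleChunks = [];
    this.sampleSize = 0;
  }

  _write(chunk, encoding, callback) {
    const room = this.maxBytes - this.sampleSize;
    if (room > 0) {
      const piece = chunk.subarray(0, room);
      this.sampleChunks.push(piece);
      this.sampleSize += piece.length;
    }
    callback(); // bytes beyond the cap are accepted but dropped
  }

  // Run detection on whatever has been collected so far.
  detectCharset() {
    return this.detect(Buffer.concat(this.sampleChunks, this.sampleSize));
  }
}
```

Assuming the module's detectCharset accepts a Buffer, it could be passed in as the detect argument, with a file stream piped into the sampler and detectCharset() called once enough data has arrived.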
@mooz Ping?
I'm sorry for the late response. I've added a method detectCharsetStream(stream, onDetectionFinish) to the module, although I'm not sure the interface matches what you had in mind.
Here is a simple example.
var detector = require("node-icu-charset-detector");
var fs = require("fs");

var fileStream = fs.createReadStream("/usr/share/dict/british-english");

// The callback receives the detected charset once detection has finished.
detector.detectCharsetStream(fileStream, function (charset) {
  console.log("charset: " + charset);
});
Hmm, it looks like the current detectCharsetStream() is somewhat problematic because it consumes the given stream. Looking for a good solution...
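One possible workaround when the input is a file on disk (a sketch only, not what the module does): open a second, bounded read stream for the first few kilobytes and run detection on that sample, leaving the main stream untouched. readPrefix and its parameters are hypothetical names; the start and end options of fs.createReadStream are standard Node.js.

```javascript
const fs = require('fs');

// Hypothetical helper: read at most `maxBytes` (assumed >= 1) from the
// start of the file at `path` and resolve with them as a single Buffer.
function readPrefix(path, maxBytes) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    // `end` is inclusive, so maxBytes - 1 yields at most maxBytes bytes.
    fs.createReadStream(path, { start: 0, end: maxBytes - 1 })
      .on('data', (chunk) => chunks.push(chunk))
      .on('error', reject)
      .on('end', () => resolve(Buffer.concat(chunks)));
  });
}
```

The resulting Buffer could then be handed to a buffer-based detection call, while a separate fs.createReadStream on the same path is used for the actual processing. This does not help for streams that cannot be reopened, such as sockets or stdin.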