jmhodges/rchardet

Doesn't detect UTF-16 without BOM.

art-solopov opened this issue · 1 comment

I have this file. I created it by converting some UTF-8 text from stdin to UTF-16LE with iconv (which didn't write a BOM). When I try to detect the encoding with rchardet, I get this result:

>>> CharDet.detect File.read('text_without_bom.txt')
=> {"encoding"=>"ascii", "confidence"=>1.0}

What I'd like to see is the actual UTF-16LE encoding.
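
For context, here is a rough sketch of why the "ascii" guess is plausible and of one possible workaround. It is not part of rchardet: `guess_utf16_without_bom` and its `threshold` parameter are hypothetical names made up for illustration, and the heuristic assumes the text is mostly ASCII/Latin, so that one byte of almost every 16-bit unit is NUL. UTF-16LE text of that kind contains no bytes above 0x7F, which is presumably why a detector that only looks for high bytes reports plain ASCII.

```ruby
# Hypothetical helper, NOT part of rchardet: guesses BOM-less UTF-16 from
# the position of NUL bytes. Assumes mostly ASCII/Latin text; it will not
# recognise e.g. CJK text, where neither byte of a unit is usually NUL.
def guess_utf16_without_bom(data, threshold: 0.4)
  bytes = data.bytes
  # UTF-16 data always has an even number of bytes.
  return nil if bytes.size < 2 || bytes.size.odd?

  pairs = bytes.size / 2
  even_nuls = 0 # NULs at offsets 0, 2, 4, ... (high byte in UTF-16BE)
  odd_nuls  = 0 # NULs at offsets 1, 3, 5, ... (high byte in UTF-16LE)
  bytes.each_slice(2) do |first, second|
    even_nuls += 1 if first.zero?
    odd_nuls  += 1 if second.zero?
  end

  even_ratio = even_nuls.to_f / pairs
  odd_ratio  = odd_nuls.to_f / pairs
  if odd_ratio >= threshold && even_ratio < 0.1
    'UTF-16LE'
  elsif even_ratio >= threshold && odd_ratio < 0.1
    'UTF-16BE'
  end
end

# String#encode to UTF-16LE does not add a BOM, like the iconv output.
sample = 'Hello, world'.encode(Encoding::UTF_16LE)
p sample.bytes.first(4)           # => [72, 0, 101, 0]
p guess_utf16_without_bom(sample) # => "UTF-16LE"
```

For a real file, reading it with `File.binread('text_without_bom.txt')` and passing the result to the helper avoids any transcoding on read. The helper returns `nil` when the NUL pattern is not clear-cut, in which case falling back to `CharDet.detect` seems reasonable.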

no idea how to fix this :D
try making a PR with a failing test-case to get things moving 🤷‍♂️