Code assumes UTF-8 encoding
tepperly commented
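The C code assumes that both arguments are UTF-8 encoded. When one string is ISO-8859-1 and the other is its UTF-8 equivalent, JaroWinkler.distance returns 0.0 even though both strings represent the same text: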
$ irb
irb(main):001:0> a = "\xe8".force_encoding("iso8859-1")
=> "\xE8"
irb(main):002:0> b = a.encode("utf-8")
=> "è"
irb(main):003:0> JaroWinkler.distance(a, b)
NameError: uninitialized constant JaroWinkler
from (irb):3
from /usr/local/bin/irb:11:in `<main>'
irb(main):004:0> require 'jaro_winkler'
=> true
irb(main):005:0> JaroWinkler.distance(a, b)
=> 0.0
irb(main):006:0> a.encoding
=> #<Encoding:ISO-8859-1>
irb(main):007:0> b.encoding
=> #<Encoding:UTF-8>
irb(main):008:0>
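
Until the C code handles other encodings, one possible workaround is to transcode both strings to UTF-8 before comparing them. The sketch below is only an illustration: `to_utf8` is a hypothetical helper, not part of jaro_winkler, and it assumes each string's declared encoding is correct. In the session above, `a` holds the single byte 0xE8 while `b` holds the two bytes 0xC3 0xA8, so the two strings only match once they share an encoding.

```ruby
# Hypothetical helper (not part of the jaro_winkler gem): return a UTF-8
# version of the string, transcoding only when the encoding differs.
# Assumes the string's declared encoding is accurate.
def to_utf8(str)
  str.encoding == Encoding::UTF_8 ? str : str.encode(Encoding::UTF_8)
end

# Same strings as in the irb session above.
a = "\xe8".dup.force_encoding("iso8859-1")
b = a.encode("utf-8")

# Normalize both sides so they share one encoding and one byte sequence.
a_utf8 = to_utf8(a)
b_utf8 = to_utf8(b)

puts a_utf8.encoding   # encoding after normalization
puts a_utf8 == b_utf8  # whether the normalized strings are equal

# With the gem loaded, the normalized strings could then be compared:
#   require 'jaro_winkler'
#   JaroWinkler.distance(to_utf8(a), to_utf8(b))
```

This only sidesteps the problem on the caller's side; the result of `JaroWinkler.distance` on the normalized strings is not shown here because it was not run against the gem.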