more robust loading of utf-8 files

Question

GoogleCodeExporter opened this issue 9 years ago · 2 comments

GoogleCodeExporter commented 9 years ago

Thanks for this really great software!

I had some trouble loading Hebrew utf-8 data in CoNLL2006 format -- it got
displayed as garbage.

This was solved with a one-line change at line 109 of io/TabFormat.java. Replace:
BufferedReader reader = new BufferedReader(new FileReader(file));
with:
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(file), "UTF-8"));
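
For context, here is a minimal self-contained sketch of the same idea. It is not the project's actual code: the class name Utf8LineReader and the method readLines are hypothetical, and it assumes Java 7 or later for try-with-resources and StandardCharsets. On Java versions before 18, FileReader(file) decodes with the platform default charset, which is why UTF-8 Hebrew text can come out garbled on machines with a different default. Passing the charset explicitly avoids that.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the project.
final class Utf8LineReader {

    // Reads every line of the file, always decoding the bytes as UTF-8
    // instead of relying on the platform default charset.
    static List<String> readLines(File file) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader (and the underlying stream)
        // even if readLine() throws.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Using the StandardCharsets.UTF_8 constant rather than the string "UTF-8" also avoids the checked UnsupportedEncodingException, though both select the same decoder.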

It would be great if this could be integrated into a future version.

Yoav

Original issue reported on code.google.com by yoav.gol...@gmail.com on 13 May 2009 at 10:46

Answer 1 · 2015-09-28T14:17:53.000Z

Hi Yoav, 

Yes, I actually had the same problem, and I fixed it in the trunk but just
didn't get around to releasing a new version. I should do this now :)
In any case, thanks for your report and patch!

Sebastian

Original comment by sebastian.riedel@gmail.com on 13 May 2009 at 10:54

Answer 2 · 2015-09-28T14:17:53.000Z

Fixed in 0.2.2

Original comment by sebastian.riedel@gmail.com on 14 May 2009 at 12:04

Changed state: Fixed