Output file encoding should be set to UTF-8
rxzhangGH opened this issue · 2 comments
rxzhangGH commented
Hi,
When using the bifixer command-line tool, I noticed an issue with line 53 of bifixer.py:
parser.add_argument('output', type=argparse.FileType('w'), default=sys.stdout, help="Fixed corpus")
Since no encoding is passed to argparse.FileType, the output file is opened with the platform's default encoding, which caused problems for me. I suggest changing the above to:
parser.add_argument('output', type=argparse.FileType('w', encoding='UTF-8'), default=sys.stdout, help="Fixed corpus")
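For illustration, here is a minimal sketch of the suggested change in isolation. The helper names `build_parser` and `write_sample` are made up for this example and are not part of bifixer; the sketch only assumes that the output argument is declared as in the line above. It writes non-ASCII text through the parsed argument and reads the raw bytes back, so the encoding actually used on disk can be checked.

```python
import argparse
import os
import sys
import tempfile


def build_parser():
    # Hypothetical stand-in for bifixer's parser: only the 'output' argument,
    # declared with an explicit UTF-8 encoding as proposed above.
    parser = argparse.ArgumentParser()
    parser.add_argument('output',
                        type=argparse.FileType('w', encoding='UTF-8'),
                        default=sys.stdout,
                        help="Fixed corpus")
    return parser


def write_sample(path, text):
    # Parse the path as the command line would, write the text through the
    # file object argparse opened, then return the raw bytes on disk.
    args = build_parser().parse_args([path])
    with args.output as out:
        out.write(text)
    with open(path, 'rb') as raw:
        return raw.read()


if __name__ == '__main__':
    sample_path = os.path.join(tempfile.mkdtemp(), 'fixed.txt')
    print(write_sample(sample_path, 'café 中文'))
```

With the explicit encoding, the bytes written should be the same regardless of the locale the tool is run under; without it, the result depends on the platform default.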
ZJaume commented
Could you please post the error? The OS and Python version would also be helpful.
ZJaume commented
It's fixed anyway now.