michaelrsweet/htmldoc

Accented characters cause display issues in v1.9; this was OK in v1.8

Sparky10 opened this issue · 3 comments

When converting content that contains accented characters, we run into a number of display issues. This started happening in v1.9 and was not an issue in v1.8.

Original HTML

col1 col2 şarap and şamaş 0.00 3

V1.8 conversion
image

V1.9 conversion
image

@Sparky10 What options are you passing on the command-line?

FWIW, when I use the default (ISO-8859-1) character set I see this problem, but if I specify UTF-8 the right thing comes out:

htmldoc --webpage --charset utf-8 -f FILENAME.pdf FILENAME.html
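
For anyone wondering why the character set makes a difference here, the following is a small Python sketch of the general encoding mismatch. It is not HTMLDOC's code and does not claim to reproduce what v1.9 does internally. It only shows that "ş" (U+015F) has no ISO-8859-1 representation, and that UTF-8 bytes read as ISO-8859-1 come out as different characters. The helper `representable_in` is a hypothetical name used for illustration.

```python
# Illustrative only: shows why the character set matters for "ş" (U+015F).
# This is not HTMLDOC's code; representable_in() is a hypothetical helper.

def representable_in(text: str, charset: str) -> bool:
    """Return True if every character in text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


sample = "şarap and şamaş"

# ISO-8859-1 only covers U+0000 to U+00FF, so "ş" cannot be encoded in it.
latin1_ok = representable_in(sample, "iso-8859-1")
utf8_ok = representable_in(sample, "utf-8")

# In UTF-8 each "ş" is two bytes (C5 9F). Reading those bytes as
# ISO-8859-1 yields two unrelated characters per "ş".
utf8_bytes = sample.encode("utf-8")
misread = utf8_bytes.decode("iso-8859-1")

print(latin1_ok, utf8_ok)            # False True
print(len(sample), len(utf8_bytes))  # 15 18
print(repr(misread))
```

If the source file is saved as UTF-8, passing `--charset utf-8` as in the command above tells the converter to treat those bytes accordingly.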

Thanks Michael, we will check this out and I will update here with results.