pdf2htmlEX/pdf2htmlEX

Random characters are replaced with  character (U+E61F)

boasbakker opened this issue · 1 comment

When converting a PDF, some characters in the HTML output are replaced with the "" character (U+E61F). You won't see this visually, because the generated font contains a glyph for that character, so the page still looks correct. I attached an example zip with the PDF and the HTML.
biological-psychology_compress-pages-2.zip
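
To check how often this happens in a generated file, a small script can count the Private Use Area characters in the HTML text. This is only a minimal sketch: the function name `find_pua_characters` and the sample string are made up for illustration, and the tag stripping is a crude regex rather than a real HTML parser.

```python
import re
from collections import Counter

# Private Use Area of the Basic Multilingual Plane: U+E000 to U+F8FF.
PUA_PATTERN = re.compile(r"[\uE000-\uF8FF]")
# Crude tag stripper; good enough for a quick check, not a full HTML parser.
TAG_PATTERN = re.compile(r"<[^>]*>")


def find_pua_characters(html: str) -> Counter:
    """Count Private Use Area characters in the text content of an HTML string."""
    text = TAG_PATTERN.sub("", html)
    return Counter(f"U+{ord(ch):04X}" for ch in PUA_PATTERN.findall(text))


# Hypothetical sample: "reflex" and "flow" with the "fl" pair replaced by U+E61F.
sample = '<div class="t">re\ue61fex and \ue61fow</div>'
print(find_pua_characters(sample))
```

For a real file, read it with `open(path, encoding="utf-8").read()` and pass the result to the function.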

Try using --decompose-ligature 1
This will not always work, but I suspect it will resolve the problem in your case. Be careful, however: the ligature ﬂ (one character) gets decomposed into "fl" (two characters), and the corresponding glyphs might be missing from the font. There is a solution to that, but you will probably be all right. Ask here if it doesn't work.
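
If you run the conversion from a script, the suggestion above could look roughly like this. It is a sketch, not a tested recipe: `build_command` and `convert` are hypothetical helper names, the file names are placeholders, and it assumes `pdf2htmlEX` is on your PATH.

```python
import shutil
import subprocess


def build_command(pdf_path, html_path=None):
    """Build the pdf2htmlEX argument list with ligature decomposition enabled."""
    cmd = ["pdf2htmlEX", "--decompose-ligature", "1", pdf_path]
    if html_path is not None:
        cmd.append(html_path)
    return cmd


def convert(pdf_path, html_path=None):
    """Run the conversion; raises if pdf2htmlEX is missing or exits with an error."""
    if shutil.which("pdf2htmlEX") is None:
        raise FileNotFoundError("pdf2htmlEX was not found on PATH")
    subprocess.run(build_command(pdf_path, html_path), check=True)


# Placeholder file names; this only prints the command, it does not run it.
print(" ".join(build_command("input.pdf", "output.html")))
```

Afterwards, search the new HTML for U+E61F again to see whether the replacement characters are gone.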