scambier/obsidian-text-extractor

[Feature request] Fix carriage returns

CaptainKludge opened this issue · 0 comments

Tesseract does not output carriage returns. After looking at an output file in a hex editor, the reason is clear: Tesseract seems to detect line breaks perfectly fine, but it only inserts the line feed character (0x0A), not the carriage return + line feed pair (0x0D 0x0A) that a Windows text file expects.

So a better behavior would be to find each 0x0A in the output string and replace it with 0x0D 0x0A. That would definitely increase usability.
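
For illustration, here is a rough sketch of the replacement I have in mind. `toCrlf` is a made-up name, not something that exists in the plugin, and I am assuming the extracted text is available as a plain string before it gets written out. The regex also matches an LF that already has a CR in front of it, so existing CRLF pairs should not get doubled.

```typescript
// Hypothetical helper, not part of the plugin's actual API.
// Converts LF-only line endings (0x0A) to CRLF (0x0D 0x0A).
function toCrlf(text: string): string {
  // "\r?\n" matches a bare LF as well as an existing CRLF pair,
  // so text that is already CRLF is left effectively unchanged.
  return text.replace(/\r?\n/g, "\r\n");
}

// Example with mixed line endings in the input.
const sample = "line one\nline two\r\nline three";
const converted = toCrlf(sample);
```

Doing this unconditionally would probably be wrong for macOS and Linux users, so it might make more sense as an opt-in setting.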