strohne/Facepager

Wrong text encoding

lukasini opened this issue · 3 comments

I'm having trouble with the encoding of characters such as š and č when scraping Slovak pages. I guess this is a limitation of SQLite, which supports only ISO-8859-1. Exporting to UTF-8 also changes nothing. Do you have a workaround or a better way to fix it?

Within the application it looks fine, but inside the database it is encoded wrongly.

Hi lukasini, I guess the encoding is correct, but you need to handle it in your subsequent workflows. How do you process the data? When you export a CSV file, does it come out with the right encoding?
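For what it's worth, SQLite stores text as UTF-8 by default, so garbled characters usually mean that the tool reading the database decodes the bytes with the wrong charset. Here is a minimal Python sketch of that effect. It uses an in-memory database with a made-up table `demo` and column `response` (both hypothetical, not Facepager's actual schema), so treat it as an illustration under those assumptions rather than a description of how Facepager stores data:

```python
import sqlite3

# Sample Slovak characters like the ones from the report.
text = "š, č, ž, ľ"

# Hypothetical table and column names, used only for this illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE demo (response TEXT)")
con.execute("INSERT INTO demo VALUES (?)", (text,))

# Python's sqlite3 module decodes TEXT values as UTF-8 by default.
(stored,) = con.execute("SELECT response FROM demo").fetchone()

# Casting to BLOB returns the raw bytes as SQLite holds them.
(raw,) = con.execute("SELECT CAST(response AS BLOB) FROM demo").fetchone()
con.close()

# The round trip through the database preserves the characters.
roundtrip_ok = stored == text

# Decoding the same bytes as ISO-8859-1 produces the garbled text
# that a misconfigured viewer would show.
mojibake = raw.decode("latin-1")

# The garbling is reversible because no bytes were lost.
repaired = mojibake.encode("latin-1").decode("utf-8")

print(roundtrip_ok)
print(repaired == text)
```

If a database browser shows garbled characters, checking its text encoding setting is probably the first thing to try.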

For R, we have an experimental package that loads the database, maybe it helps you? See https://github.com/strohne/datavana/tree/main/facepager

I suggest you use the export button of Facepager, that handles the encoding :)
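If you then read the exported file in Python, passing the encoding explicitly avoids depending on the platform default. The sketch below is only an assumption about the file layout: it writes a small sample file itself, and the file name `export.csv`, the column names, the semicolon delimiter and the UTF-8 encoding are all placeholders to adjust to the file you actually get from the export:

```python
import csv
import os
import tempfile

# Stand-in for an exported file; names and values are made up.
rows = [["object_id", "message"], ["1", "Žltý kôň, š a č"]]
path = os.path.join(tempfile.mkdtemp(), "export.csv")

# Write the sample as UTF-8 with a byte order mark.
with open(path, "w", encoding="utf-8-sig", newline="") as f:
    csv.writer(f, delimiter=";").writerows(rows)

# "utf-8-sig" reads UTF-8 files with or without a byte order mark,
# so the first header does not pick up a stray "\ufeff" prefix.
with open(path, encoding="utf-8-sig", newline="") as f:
    loaded = list(csv.reader(f, delimiter=";"))

print(loaded == rows)
```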

Yes, correct, direct export to CSV works. Thank you for the support.