Improve support for CJK characters
Closed · 1 comment
ljvmiranda921 commented
Hi there,
Thanks for this tool. It has been really helpful!
I'm just having issues with Chinese/Japanese/Korean characters. Whenever they are encoded, they look garbled. I'm not sure whether it's a UTF-8 parsing problem on the Rust side or the Python side.
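For illustration, here is a minimal sketch of one common way this kind of garbling arises: UTF-8 bytes being turned into text one byte per character (equivalent to decoding as Latin-1). This is only an assumption about the cause, not something confirmed from the qvd source, and the helper names `garble` and `ungarble` are hypothetical.

```python
def garble(text: str) -> str:
    """Simulate the suspected bug: encode as UTF-8, then decode each byte as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def ungarble(text: str) -> str:
    """Reverse the mis-decoding, which only works if the assumption above holds."""
    return text.encode("latin-1").decode("utf-8")


original = "日本語"
broken = garble(original)

# Each CJK character here is 3 bytes in UTF-8, so 3 characters become 9.
print(len(original), len(broken))
print(ungarble(broken) == original)
```

If applying something like `ungarble` to a garbled value from the output CSV recovers the original text, that would suggest the bytes are intact and only the decoding step is wrong.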
Steps to reproduce
I've attached the sample input and output for this issue. Here's a minimal reproducible example:
from qvd import qvd_reader
df = qvd_reader.read("sample_data.qvd")
df.to_csv("sample_data.csv")
Input
Download link (expires after 24h):
https://wormhole.app/wjll0#vX3tLuzrucIFtO-LaWRpgg
Output
sample_data.csv
SBentley commented
Hi, thanks for raising the issue and providing a file to test. I've pushed a new version that should now support all UTF-8 strings.