cloudflare/py-mmdb-encoder

Corrupted file when a field has non-ASCII characters


When I create an mmdb whose field values contain non-ASCII characters, the resulting file cannot be read. It looks as though the offsets are wrong.

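I think this is because the length written to the file assumes that the Python string length (a count of characters) equals the number of bytes produced when the string is encoded to UTF-8. That only holds for pure ASCII strings; any other character encodes to two or more bytes.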
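To illustrate the mismatch, here is a minimal sketch that compares the two lengths. The sample strings and the helper name `char_and_byte_lengths` are made up for this example and are not part of the library.

```python
# Sample values chosen for illustration only.
samples = ["Paris", "Zürich", "東京"]


def char_and_byte_lengths(value):
    """Return (number of characters, number of UTF-8 bytes) for a str."""
    return len(value), len(value.encode("utf-8"))


for s in samples:
    chars, nbytes = char_and_byte_lengths(s)
    print(f"{s!r}: {chars} characters, {nbytes} bytes")
```

Only the pure ASCII string has matching counts, so any length taken from `len(value)` comes out too short for the other two.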

Computing the length from the encoded string instead seems to produce the correct result at https://github.com/cloudflare/py-mmdb-encoder/blob/master/mmdbencoder/__init__.py#L346:

length = len(value.encode('utf-8'))
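For context, a rough sketch of why the byte count matters when writing a string field. This is not the library's code: `encode_short_string` is a hypothetical helper, and it assumes my reading of the MMDB format, where a UTF-8 string is type 2 and payloads shorter than 29 bytes store their size in the low five bits of the control byte.

```python
def encode_short_string(value):
    """Hypothetical helper: control byte plus payload for a short string."""
    payload = value.encode("utf-8")
    length = len(payload)  # byte count, not len(value)
    if length >= 29:
        # Longer payloads need extra size bytes, which this sketch omits.
        raise ValueError("sketch only handles payloads shorter than 29 bytes")
    control = (2 << 5) | length  # assumed: type 2 = UTF-8 string
    return bytes([control]) + payload


encoded = encode_short_string("Zürich")
# The size stored in the control byte matches the bytes that follow it.
assert (encoded[0] & 0x1F) == len(encoded) - 1
```

If the size were taken from `len(value)`, a reader would stop one byte early on "Zürich" and interpret the leftover byte as the start of the next field, which would explain the corrupted offsets.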