tnef_attachement.name is not unicode
guettli opened this issue · 9 comments
I get a UnicodeError in my application code, because tnef_attachement.name is not unicode.
Could the tnef library be updated to make tnef_attachement.name a unicode string?
In my case it looks like a latin1 string. It would improve the usability of the library if the application programmer did not need to do the string decoding.
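For illustration, the decoding that application code currently has to do can be sketched as follows. This is only a workaround sketch: `to_text` is a hypothetical helper, not part of the tnef library, and latin-1 is an assumption based on what the name looked like in this one case.

```python
def to_text(name, encoding="latin-1"):
    """Return name as a unicode string.

    Hypothetical helper: the default encoding is an assumption, not
    something the TNEF data is known to guarantee.
    """
    if isinstance(name, bytes):
        # latin-1 maps every byte to a code point, so this never raises,
        # but the result is wrong if the real encoding is something else.
        return name.decode(encoding)
    # Already text: return unchanged.
    return name
```

Calling `to_text(tnef_attachement.name)` in the application would avoid the UnicodeError, at the cost of guessing the encoding.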
Sorry, I can't post the tnef binary, since it is from a customer.
BTW, how can you create tnef test binaries?
Unfortunately I don't have the matching test file here to prove whether the patch solves this issue.
I trust you and I think this issue can be closed.
Should I close it?
I'm not in a position to make that decision.
Please, if you can, provide a PR.
Note: master now strips null bytes from long_filename as well.
The encoding of tnef attachment (long) names is tricky to get right.
See for example the discussion at roundcube/roundcubemail#5646.
If you have good understanding of how the encodings in tnef attachments work or any links to information, that would help.
Note: we should also consider how this is to work on Python 2 vs. Python 3 (with the help of six?)
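As a starting point for discussion, here is a minimal sketch of a decoder that behaves the same on Python 2 and Python 3 without depending on six. Everything in it is an assumption rather than the library's actual behaviour: `decode_name` is a hypothetical name, the candidate encodings (UTF-8, then cp1252) are guesses, only trailing null bytes are stripped, and UTF-16 names are not handled at all.

```python
# type(u"") is `unicode` on Python 2 and `str` on Python 3,
# so no six import is needed for this check.
text_type = type(u"")


def decode_name(raw, encodings=("utf-8", "cp1252")):
    """Decode a raw attachment name to text (hypothetical helper)."""
    if isinstance(raw, text_type):
        # Already text: only drop trailing null terminators.
        return raw.rstrip(u"\x00")
    raw = raw.rstrip(b"\x00")
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: latin-1 accepts every byte, so this cannot fail,
    # although the result may not be the intended characters.
    return raw.decode("latin-1")
```

With six, the same check could be written using `six.text_type`. The open question from the discussion remains: which encoding the TNEF data really uses, since trying candidates in order is only a heuristic.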
thank you!