Header fields (From, To, Subject, etc.) are not generated properly.
fatihkaymak opened this issue · 3 comments
- Version of extract_msg: [0.41.2]
- Your python version: Python [3.10]
- How did you launch extract_msg?
- I used the extract_msg package
Describe the bug
It read the .msg file and generated an MSG object, but the header info (From, To, Subject, etc.) was not correct. The message contains Turkish characters, and these are what come out wrong.
What code did you use or can we use to reproduce this error?
import extract_msg

with extract_msg.openMsg('data/sample-mail.msg', overrideEncoding='utf-8') as msg:
    html = msg.htmlBody.decode("utf-8")
    print(msg.htmlInjectableHeader)
Is there a message.msg file you want to share to help us reproduce this?
- Uploaded message (drag and drop on this window)
sample-mail.zip
Additional context
It generated this header:
From:
=?iso-8859-9?Q?Fatih_Kaymak_=28M=FC=FEteri_ve_Sat=FD=FE_Teknolojileri_B?=
=?iso-8859-9?B?9mz8bfwp?= <Fatih.Kaymak@akbank.com>
Sent: Wed, 07 Jun 2023 15:41:30 +0300
To: =?iso-8859-9?Q?Fatih_Kaymak_=28M=FC=FEteri_ve_Sat=FD=FE_Teknolojileri_B?= =?iso-8859-9?B?9mz8bfwp?= <Fatih.Kaymak@akbank.com>
Subject: extract-msg sampla mail
Looks like the email module isn't parsing the header right for that part. Didn't think that would be an issue, so now I have to figure out how to decode it.
Edit: Ah I see now. Looks like the way the module parses the header doesn't deal with those encoding strings, but email.header.decode_header can be used with some code to give the corrected value. I'll try to get out a fix for this within the next week or so.
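For anyone hitting this before a fix lands, here is a minimal sketch of the email.header.decode_header approach mentioned above. It is not the code that went into extract_msg; the helper name decode_mime_header is hypothetical and only the standard library is used. The sample value is the To header from the report.

```python
from email.header import decode_header, make_header


def decode_mime_header(raw: str) -> str:
    """Decode RFC 2047 encoded words (=?charset?Q?...?= / =?charset?B?...?=).

    Hypothetical helper for illustration; not part of extract_msg.
    """
    # decode_header splits the value into (bytes_or_str, charset) pairs;
    # make_header reassembles them into a Header whose str() is plain text.
    return str(make_header(decode_header(raw)))


# The raw To header from the report above.
raw_to = (
    "=?iso-8859-9?Q?Fatih_Kaymak_=28M=FC=FEteri_ve_Sat=FD=FE_Teknolojileri_B?= "
    "=?iso-8859-9?B?9mz8bfwp?= <Fatih.Kaymak@akbank.com>"
)

decoded_to = decode_mime_header(raw_to)
print(decoded_to)
```

With the header from this issue, the display name should come out as "Fatih Kaymak (Müşteri ve Satış Teknolojileri Bölümü)" followed by the address. Values without encoded words, such as the Subject here, pass through unchanged.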
This issue is now fixed. The fields should look correct now (and do look correct on the email I tested against).
Thank you.