Exception reading message 2
krumok opened this issue · 9 comments
Hi, I get this error when reading a message:
System.FormatException: Input string was not in a correct format.
at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
at System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info)
at System.Int32.Parse(String s, IFormatProvider provider)
at OpenPop.Mime.Decode.EncodingFinder.FindEncoding(String characterSet)
at OpenPop.Mime.MessagePart.ParseBodyEncoding(String characterSet)
at OpenPop.Mime.MessagePart..ctor(Byte[] rawBody, MessageHeader headers)
at OpenPop.Mime.Message..ctor(Byte[] rawMessageContent, Boolean parseBody)
at OpenPop.Mime.Message..ctor(Byte[] rawMessageContent)
at OpenPop.Pop3.Pop3Client.GetMessage(Int32 messageNumber)
This happens when calling the GetMessage method.
It only happens on one specific message; other messages are read successfully.
For now I skip this message with a try/catch so that it does not block the rest of the POP download, but I cannot read the affected message.
Can you define a fallback decoder that prints out the character set string and then post it here?
Something like:
EncodingFinder.FallbackDecoder = delegate(string characterSet)
{
Console.WriteLine(characterSet);
return null;
};
Where do I have to add the FallbackDecoder?
Here is my code (simplified):
Dim Pop3C As New OpenPop.Pop3.Pop3Client
Dim TotMail, CurrMail
Dim IsConnected As Boolean
Try
Pop3C.Connect(cH, 110, False)
Pop3C.Authenticate(cU, cP)
IsConnected = True
Catch ex As Exception
IsConnected = False
End Try
If IsConnected And uscita <> "ERR-SRV" Then
Dim EmailIds As New List(Of Integer)
TotMail = Pop3C.GetMessageCount
Dim delmail = 0
Dim i = 1
CurrMail = 1
Dim cMessage As OpenPop.Mime.Message
logstr = logstr & "connesso pec " & IsPEC.ToString & " totmail " & TotMail & Chr(13) & Chr(10)
OpenPop.Mime.Decode.EncodingFinder.AddMapping("ISO8859-15", System.Text.Encoding.UTF8)
If TotMail > 0 Then
If TotMail > MaxEmails Then
uscita = "ERR-SRV RESULT:=8 "
Else
For i = 1 To CInt(6)
Try
cMessage = Pop3C.GetMessage(CurrMail)
delmail = parseMessage(cMessage, iddip, utente_nome, 0, IsPEC)
If delmail > 0 Then
If delmail = 1 Then Pop3C.DeleteMessage(CurrMail)
i = i - 1
End If
Catch ex As Exception
logstr = logstr & "ERRORE Lettura email #" & CurrMail & ": " & ex.ToString & Chr(13) & Chr(10)
End Try
CurrMail = CurrMail + 1
If CurrMail > TotMail Then Exit For
Next
uscita = "OK RESULT:=3 "
If delmail = 2 Then uscita = "OK-NONEW RESULT:=2 "
End If
Else
uscita = "OK-NONEW RESULT:=2 "
End If
Pop3C.Disconnect()
End If
I get the error "System.FormatException: Input string was not in a correct format. ..."
on cMessage = Pop3C.GetMessage(CurrMail)
I'd put it in the same place that you've put the custom mapping.
OK, I added the FallbackDecoder, but with this particular message the delegate is not called; an exception is thrown instead. I tried with other messages and the delegate is called correctly, and I can see the characterSet in the console log.
Here is the source of the message, extracted from webmail:
Return-Path: <--------------------------->
Delivered-To: --------------------
Received: from localhost (localhost [127.0.0.1])
by -------------------- (Postfix) with ESMTP id A586829A077
for <------------------->; Mon, 22 Jun 2015 08:30:15 +0200 (CEST)
X-Virus-Scanned: ------- AntiSpam System
X-Spam-Flag: YES
X-Spam-Score: 7.21
X-Spam-Level: *******
X-Spam-Status: Yes, score=7.21 tagged_above=3.9 required=5
tests=[DNS_FROM_AHBL_RHSBL=2.025, HTML_MESSAGE=0.001,
HTML_MIME_NO_HTML_TAG=1.052, MIME_HTML_ONLY=1.672, SPF_SOFTFAIL=0.654,
SUBJ_ALL_CAPS=1.806]
Received: from --------------------- ([127.0.0.1])
by localhost (--------------------- [127.0.0.1]) (amavisd-new, port 10024)
with LMTP id GEi-54BQYLjN for <--------------------->;
Mon, 22 Jun 2015 08:30:06 +0200 (CEST)
Received: from smtpcmd02102.aruba.it (smtpcmd02102.aruba.it [62.149.158.102])
by --------------------- (Postfix) with ESMTP id 72E5129A071
for <--------------------->; Mon, 22 Jun 2015 08:30:06 +0200 (CEST)
Received: from BTGS.COM ([62.10.178.126])
by smtpcmd02.ad.aruba.it with bizsmtp
id jJW11q01E2k0jgm01JW2cY; Mon, 22 Jun 2015 08:30:05 +0200
MIME-Version: 1.0
From: "Giulia" <--------------------->
Reply-To: ---------------------
To: ---------------------
Subject: [Spam: MED] BANDI PER I SETTORI INDUSTRIA, TURISMO, COMMERCIO,
ARTIGIANATO.
Content-Type: text/html; charset="windows-1252http-equivContent-Type"
Content-Transfer-Encoding: quoted-printable
X-Mailer: SendBlaster.1.5.5
Date: Mon, 22 Jun 2015 08:28:39 +0200
Message-ID: 22722539762401220130843@Tiscali
I finanziamenti agevolati per le le aziende e le PMI, rappresentano=
una risorsa fondamentale per il sostegno degli investimenti per la cre=
scita e il consolidamento. Clicca qui.
Consulta il Portale Italiano.
Per cancellarti dalle news sul =
sito.
I've obscured sensitive data
If you take a look at the charset, it's "windows-1252http-equivContent-Type", which should just be "windows-1252". Add another custom mapping, like:
OpenPop.Mime.Decode.EncodingFinder.AddMapping("windows-1252http-equivContent-Type", System.Text.Encoding.GetEncoding(1252))
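The stack trace shows Int32.Parse being called from EncodingFinder.FindEncoding, which suggests that the part of the name after "windows-" is parsed as a code page number, and that this throws before the fallback decoder is ever consulted. The following is a minimal sketch that reproduces the failure outside OpenPop. The class CharsetParseDemo and the helper ParseCodePage are hypothetical, and the prefix handling is an assumption about what the library does, not its actual code.

```csharp
using System;
using System.Globalization;

public static class CharsetParseDemo
{
    // Hypothetical helper: treats everything after "windows-" as a code page number,
    // which is roughly what the stack trace implies happens inside FindEncoding.
    public static int ParseCodePage(string characterSet)
    {
        string suffix = characterSet.Substring("windows-".Length);
        return int.Parse(suffix, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // A well-formed name parses cleanly.
        Console.WriteLine(ParseCodePage("windows-1252")); // prints 1252

        try
        {
            // The charset from the message above: the suffix is not a number.
            ParseCodePage("windows-1252http-equivContent-Type");
        }
        catch (FormatException ex)
        {
            Console.WriteLine("FormatException: " + ex.Message);
        }
    }
}
```

Registering an exact-match mapping for the malformed name, as suggested above, avoids the numeric parse for this one message.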
OK, thank you.
With the custom mapping, that message is now read correctly.
Is this something that will be fixed in a future release?
This is not something that is easily solvable, as the email you received is clearly invalid.
It would be possible to have this case handled by OpenPop by automatically stripping non-numerical characters from the string if it starts "windows-" or "cp-".
I don't have access to a C# dev env at the minute, so you're welcome to submit a PR.
Having said that, it's a very specific case, with the charset obviously wrong so it may be better leaving it to throw an exception.
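For anyone considering that PR, here is a rough sketch of the stripping idea. It is not part of OpenPop: the class LenientCodePage and the method TryGetCodePage are hypothetical names. It is also a slightly narrower variation of the proposal, keeping only the leading run of digits after the prefix rather than removing every non-numeric character, and it uses int.TryParse so that nothing throws.

```csharp
using System;
using System.Globalization;

public static class LenientCodePage
{
    // Hypothetical helper, not an OpenPop API.
    // Returns true and sets codePage when the name starts with "windows-" or "cp-"
    // and is immediately followed by at least one digit.
    public static bool TryGetCodePage(string characterSet, out int codePage)
    {
        codePage = -1;
        if (characterSet == null)
            return false;

        string rest;
        if (characterSet.StartsWith("windows-", StringComparison.OrdinalIgnoreCase))
            rest = characterSet.Substring("windows-".Length);
        else if (characterSet.StartsWith("cp-", StringComparison.OrdinalIgnoreCase))
            rest = characterSet.Substring("cp-".Length);
        else
            return false; // e.g. "iso-2022-jp" is left to the normal lookup

        // Keep only the leading digits and ignore any trailing junk.
        int end = 0;
        while (end < rest.Length && rest[end] >= '0' && rest[end] <= '9')
            end++;
        if (end == 0)
            return false;

        if (!int.TryParse(rest.Substring(0, end), NumberStyles.None,
                          CultureInfo.InvariantCulture, out codePage))
        {
            codePage = -1; // e.g. a digit run too long for an int
            return false;
        }
        return true;
    }

    public static void Main()
    {
        int codePage;
        if (TryGetCodePage("windows-1252http-equivContent-Type", out codePage))
            Console.WriteLine(codePage); // prints 1252
    }
}
```

Whether the resulting number maps to an available encoding is a separate question: on .NET Core and later, Encoding.GetEncoding(1252) only works once the code pages encoding provider has been registered.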
The way that MimeKit handles this is to:
- avoid the use of int.Parse() and instead use int.TryParse(): https://github.com/jstedfast/MimeKit/blob/master/MimeKit/Utils/CharsetUtils.cs#L218
- if that fails (in this case, it would), then it returns a codepage of -1
- when the codepage is -1, attempt to convert using UTF-8, followed by the user's default charset, followed by ISO-8859-1: https://github.com/jstedfast/MimeKit/blob/master/MimeKit/Utils/CharsetUtils.cs#L451
- since the user's default charset can be overridden on the ParserOptions, it's possible for the user to override it at parse time (but defaults to Encoding.Default).
- after the message is parsed, since each Header has a GetValue() method allowing you to specify a fallback charset to use, you can once again override it if the user decides that the text doesn't look right and wants to try another charset encoding - and no need to re-parse the entire message again, simply re-decode the individual header value(s) that the user wants.
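The first three steps above can be sketched roughly as follows. This is not MimeKit's code: the class FallbackDecodeSketch and both method names are hypothetical, and the sketch only illustrates the pattern of "never throw on a bad charset, then try a chain of decoders".

```csharp
using System;
using System.Globalization;
using System.Text;

public static class FallbackDecodeSketch
{
    // Returns -1 instead of throwing when the text is not a plain number.
    public static int ParseCodePage(string digits)
    {
        int codePage;
        return int.TryParse(digits, NumberStyles.None,
                            CultureInfo.InvariantCulture, out codePage)
            ? codePage
            : -1;
    }

    // Tries strict UTF-8, then the caller's default charset, then ISO-8859-1.
    public static string DecodeWithFallbacks(byte[] data, Encoding userDefault)
    {
        // Strict UTF-8: invalid byte sequences raise an exception instead of
        // being silently replaced.
        Encoding strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            return strictUtf8.GetString(data);
        }
        catch (ArgumentException)
        {
            // Not valid UTF-8; fall through to the next candidate.
        }

        if (userDefault != null)
        {
            // Clone so the caller's instance is untouched, then make it strict.
            Encoding strictDefault = (Encoding)userDefault.Clone();
            strictDefault.DecoderFallback = DecoderFallback.ExceptionFallback;
            try
            {
                return strictDefault.GetString(data);
            }
            catch (ArgumentException)
            {
                // Fall through to the last resort.
            }
        }

        // ISO-8859-1 maps every byte to a character, so this cannot fail.
        return Encoding.GetEncoding("iso-8859-1").GetString(data);
    }

    public static void Main()
    {
        Console.WriteLine(ParseCodePage("1252http-equivContent-Type")); // prints -1

        // 0xE9 on its own is invalid UTF-8, so the ISO-8859-1 fallback is used.
        string text = DecodeWithFallbacks(new byte[] { 0xE9 }, null);
        Console.WriteLine(text == "\u00E9"); // prints True
    }
}
```

Because the raw bytes are kept, a caller can run the same chain again later with a different userDefault, which mirrors the "re-decode without re-parsing" idea in the last step.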
MimeKit follows the philosophy that exceptions while parsing messages should be avoided if possible and a sane fallback should be taken, while still allowing the user to get access to the raw data and "re-try" if things didn't parse exactly right.
That said, @foens is correct that this particular message is invalid and trying to get the correct charset encoding out of that string is probably not worth the trouble. While in this particular case, stripping off text after the last numeric character might work, there are other charsets such as "iso-2022-jp" where you can obviously not do that and expect things to work.