techsneeze/dmarcts-report-parser

Support of .gz files

Closed this issue · 11 comments

laerm commented

Hi,

First of all, thanks for the script. Nice to have all my reports in a DB. I finally got round to adding a DMARC entry in my DNS and have started getting reports. One issue I have is about 50% of them are not .zip files but .gz, which isn't supported by the script. Any chance we could see this in the next version?

Some of the domains that send .gz are:

  • fastmail.com
  • iledefrance.fr
  • chu-dijon.fr
  • linkedin.com
  • qq.com
  • smtp99.wallonie.be
  • comcast.net
  • ...
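For illustration only, here is a minimal Python sketch of how a report payload can be handled regardless of whether the sender used gzip or zip. It is not the script's own (Perl) logic; the function name `extract_report_xml` is hypothetical, and it assumes an archive contains a single report. It checks the leading magic bytes rather than trusting the file extension.

```python
import gzip
import io
import zipfile

GZIP_MAGIC = b"\x1f\x8b"      # first two bytes of every gzip stream
ZIP_MAGIC = b"PK\x03\x04"     # local file header of a non-empty zip archive


def extract_report_xml(data: bytes) -> bytes:
    """Return the XML payload of a gzip- or zip-compressed report.

    Hypothetical helper for illustration; not part of dmarcts-report-parser.
    """
    if data.startswith(GZIP_MAGIC):
        return gzip.decompress(data)
    if data.startswith(ZIP_MAGIC):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            # Assumption: one report per archive, so take the first member.
            first_member = archive.namelist()[0]
            return archive.read(first_member)
    # Neither signature matched: assume the data is already plain XML.
    return data
```

Sniffing the content this way means a mislabeled attachment (for example a gzip stream named `.zip`) would still be decompressed correctly.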

laerm commented

Sorry, just had a quick look at the code and it should be supported. I was running into the problem with your previous rddmarc-ts script. Sorry for the noise!

Still an issue when I specify a path full of compressed xml reports though:

There are 1 messages to be processed.
--------------------------------
The Current Message is: /tmp/dmarc/yahoo.com!unil.ch!1459900800!1459987199.xml.zip
--------------------------------
Subject: MimeType: text/plain
Could not find an embedded ZIP in </tmp/dmarc/yahoo.com!unil.ch!1459900800!1459987199.xml.zip>. Skipped.

Installed all the mentioned dependencies but still something missing somewhere.

I'd imagine it is somewhat related to #17. If you unzip that file, is the script able to process it?

laerm commented

I get the same error message, whether the file is zipped or not:

$ ./dmarcts-report-parser.pl -d /tmp/dmarc/
There are 1 messages to be processed.
--------------------------------
The Current Message is: /tmp/dmarc/
--------------------------------
Subject: MimeType: text/plain
Could not find an embedded ZIP in </tmp/dmarc/>. Skipped.

Tried removing special characters in the file name (exclamation marks, extra full stop) but no dice.
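As a side note, the exclamation marks appear to be intentional: the DMARC specification (RFC 7489) describes aggregate report file names as receiver, policy domain, begin timestamp, and end timestamp joined by "!", with an optional unique ID, followed by the extension. A rough Python sketch of splitting such a name is shown below; `ReportName` and `parse_report_filename` are hypothetical names, and the sketch assumes the fields themselves contain no "!" and that the final field before the extension contains no ".".

```python
from typing import NamedTuple, Optional


class ReportName(NamedTuple):
    receiver: str
    policy_domain: str
    begin: int            # start of the reporting period (Unix time)
    end: int              # end of the reporting period (Unix time)
    unique_id: Optional[str]
    extension: str


def parse_report_filename(filename: str) -> ReportName:
    """Split a '!'-separated aggregate report file name into its fields."""
    fields = filename.split("!")
    if len(fields) not in (4, 5):
        raise ValueError(f"unexpected report file name: {filename!r}")
    receiver, policy_domain, begin = fields[0], fields[1], fields[2]
    # The extension is attached to the last '!'-separated field.
    last, _, extension = fields[-1].partition(".")
    if len(fields) == 5:
        end, unique_id = fields[3], last
    else:
        end, unique_id = last, None
    return ReportName(receiver, policy_domain, int(begin), int(end),
                      unique_id, extension)
```

Applied to the file name from the earlier output, `parse_report_filename("yahoo.com!unil.ch!1459900800!1459987199.xml.zip")` yields `yahoo.com` as the receiver and `unil.ch` as the policy domain.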

Your output shows you pointed it to the folder. Does it happen if you explicitly call the extracted file as well? You'll notice in the output I shared in #17 I pointed it at the file directly.

laerm commented

Same error if I point it to an unzipped file:

./dmarcts-report-parser.pl -d /tmp/dmarc/
There are 1 messages to be processed.
--------------------------------
The Current Message is: /tmp/dmarc/
--------------------------------
Subject: MimeType: text/plain
Could not find an embedded ZIP in </tmp/dmarc/>. Skipped.

That output shows you're only pointing it at the directory. Can you try:

./dmarcts-report-parser.pl -d /tmp/dmarc/some message file.eml

Thanks!

laerm commented

Same error:

./dmarcts-report-parser.pl -d /tmp/dmarc/ComUEunilch01459977922.xml 
There are 1 messages to be processed.
--------------------------------
The Current Message is: /tmp/dmarc/ComUEunilch01459977922.xml
--------------------------------
Subject: MimeType: text/plain
Could not find an embedded ZIP in </tmp/dmarc/ComUEunilch01459977922.xml>. Skipped.

Starting to suspect it might be a missing Perl package, or the version I have installed behaving differently. I've attached the output of perldoc perllocal, which lists the installed Perl modules:
installed_cpan.txt

laerm commented

Ah, I think you meant the .eml file, yes, that works!

./dmarcts-report-parser.pl -d /tmp/dmarc/test.eml 
There are 1 messages to be processed.
--------------------------------
The Current Message is: /tmp/dmarc/test.eml
--------------------------------
Subject: Report Domain: unil.ch Submitter: ComUE Report-ID: unil.ch-1459977903@ComUE
MimeType: multipart/mixed
This is a multipart attachment 
Skipped an unknown attachment 
/tmp/msg-3758-2.zip
body is in /tmp/msg-3758-2.zip
Already have ComUE unil.ch:1459977903, skipped
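The difference between the two runs is consistent with the input being parsed as an email message: a raw .xml or .zip file has no headers, so it shows up as a single text/plain part with an empty subject, whereas the .eml file is multipart/mixed with the archive as an attachment. The Python sketch below illustrates that idea using the standard `email` package; it is not the script's Perl code, and `find_report_attachment` is a hypothetical name.

```python
from email import message_from_bytes
from email.policy import default

ARCHIVE_SUFFIXES = (".zip", ".gz")


def find_report_attachment(raw: bytes):
    """Return (filename, payload) of the first zip/gz attachment, or None.

    Hypothetical helper for illustration; not part of dmarcts-report-parser.
    """
    message = message_from_bytes(raw, policy=default)
    # walk() yields the message itself and then every nested MIME part.
    for part in message.walk():
        name = part.get_filename()
        if name and name.lower().endswith(ARCHIVE_SUFFIXES):
            # decode=True undoes the base64 transfer encoding.
            return name, part.get_payload(decode=True)
    return None
```

Feeding this function the bytes of a bare XML report returns `None`, because the parser finds no headers and no attachment parts, which mirrors the "Could not find an embedded ZIP" message above.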

laerm commented

Dumping the $ent variable, I get the following when using the IMAP feature:

The Current Message UID is: 63
--------------------------------
Subject: MimeType: text/plain
$VAR1 = bless( {
                 'ME_Bodyhandle' => bless( {
                                             'MB_Path' => '/tmp/msg-11149-31.txt'
                                           }, 'MIME::Body::File' ),
                 'mail_inet_head' => bless( {
                                              'mail_hdr_mail_from' => 'KEEP',
                                              'mail_hdr_modify' => 0,
                                              'mail_hdr_foldlen' => 79,
                                              'mail_hdr_hash' => {},
                                              'mail_hdr_lengths' => {},
                                              'mail_hdr_list' => []
                                            }, 'MIME::Head' ),
                 'ME_Parts' => []
               }, 'MIME::Entity' );
Could not find an embedded ZIP in <IMAP message with UID #63>. Skipped.
Moving (copy and delete) processed IMAP message file to IMAP folder: tmpdmarc_processed

IMAP user here. .gz is supported and working ok.
If you want to parse extracted .xml files, use the -x parameter.

laerm commented

Using the fix provided by @key134 in #17, i.e. adding IgnoreSizeErrors => 1 to the IMAP connection options:

                Password => $imappass,
                IgnoreSizeErrors => 1)
        # module uses eval, so we use $@ instead of $!

fixes this issue.