techsneeze/dmarcts-report-parser

Reports with correct extension but wrong mimetype


I've been getting reports from mimecast dot org whose attachments end in .zip but carry a gzip mimetype. The script skips these emails because gunzip fails to unpack the zip file. I sent a report to their abuse address, since they have no monitored reply address for reporting problems, only a customer forum.

I ran into the same issue on my fork and added a check to account for it, since the problem has proven persistent.

I added the following between what would be lines 654 and 655 in this version (YMMV; some values differ between the two scripts):

      # check to see if the mimetype was incorrectly set to gzip *cough mimecast*
      if ($unzip eq "gzip") {
        printDebug("gzip failed, checking if incorrect mime type");
        # incorrect mimetype with incorrect file extension seems to manifest as a .dat
        # so let's check for that
        my $ext = '';
        if ($filename =~ /\.([^.]+)$/) {
          $ext = $1;
          # terminate the debug line so it doesn't run into the next message
          print "extension: $ext\n" if $debug;
        }
        else {
          warn "$scriptname: Could not find extension for (<$filename>)!\n";
        }

        # if it's a dat file, then run this process again, but with $isgzip set to 0
        if ($ext eq "dat") {
          printDebug("Trying again using unzip");
          $xml = getXMLFromZip($filename,0);

          # if it fails again, it'll warn, if not, it'll return the correct xml
          return $xml;
        }
      }

It's a bit of a kludge, but it does correctly ingest these files now. I still have mixed feelings about doing it this way, as it might have some security implications, but it does fix the issue on our end.
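A variant that doesn't depend on the extension at all would be to look at the first few bytes of the file and pick the unpacker from those, instead of retrying after gzip fails. Below is a minimal sketch of that idea. The helper name `sniffArchiveType` is hypothetical and is not part of either script; it only relies on the well-known signatures (gzip starts with 0x1f 0x8b, a zip archive normally starts with "PK\x03\x04").

```perl
use strict;
use warnings;

# Hypothetical helper (not in either script): guess the archive type from
# the file's leading "magic" bytes rather than its mimetype or extension.
sub sniffArchiveType {
    my ($path) = @_;

    # Read raw bytes; treat an unreadable file as unknown.
    open(my $fh, '<:raw', $path) or return 'unknown';
    my $read = read($fh, my $magic, 4);
    close($fh);

    return 'unknown' unless defined $read && $read >= 2;

    # gzip streams start with 0x1f 0x8b
    return 'gzip' if substr($magic, 0, 2) eq "\x1f\x8b";

    # zip archives normally start with a local file header, "PK\x03\x04"
    return 'zip' if $read == 4 && $magic eq "PK\x03\x04";

    return 'unknown';
}
```

The result could then decide what to pass as the second argument of getXMLFromZip (the $isgzip flag in the snippet above), so a mislabeled attachment is handled on the first attempt. I haven't wired this into the script, so treat it as an untested suggestion.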

It looks like Mimecast finally fixed their reports, and they now send a proper .xml.gz file.