/mboxzilla

Export / upload emails from Thunderbird mbox files to single eml files

Primary LanguageC++BSD 2-Clause "Simplified" LicenseBSD-2-Clause

mboxzilla

Export / upload emails from Thunderbird mbox files to single eml files

Feature summary

mboxzilla is a free sofware designed to :

  • extract emails from mbox file to single eml files
  • compact a mbox file by removing all emails marked as deleted as well as malformed
  • split a mbox file into smaller mbox files
  • upload extracted emails to a remote directory in a safe mode
  • apply the above tasks automatically for Mozilla Thunderbird
  • eml files can be compressed in gzip format
  • logging can be enabled
  • supported platforms : Windows, Linux
  • 2-Clause BSD License

Usage

Command line options

  mboxzilla [OPTION...] [CONFIG_FILE]

  -f, --file FILE             Input mbox file.
  -o, --output DIR            Output directory.
  -p, --path PATH             Base input path where mbox file is stored. When
                              is set then input mbox specified by 'f' option
                              is concatenated to output directory. If PATH is
                              a substring from input mbox file then it is
                              from string following the path argument.
  -e, --extract               Extract emails from mbox file to eml files
                              named 'YYYYmmddHHMMSS_MD5.eml' (or eml.gz) to a
                              local directory.
  -c, --compact               Compact mbox file to file whose name is
                              formatted in 'mboxfilename_YYYYmmddHHMMSS'.
  -s, --split N               Split mbox file into several mbox files of
                              maximum N bytes size. This smaller files are
                              named 'mboxfilename.#' where # is an increment
                              number with auto leading zero if necessary.
  -a, --auto                  Automatic mbox files search and parse for
                              Mozilla Thunderbird client. The search is
                              performed in the current user directory.
      --with-localfolders     Force processing the 'Thunderbird local
                              folders' directory for all profiles selected.
                              Only used with 'auto' option.
      --force-user USER       Set the user name profile for Thunderbird
                              search. Only used with 'auto' option.
      --email-domain DOMAIN   Keep only corresponding accounts found in
                              Thunderbird. Only used with 'auto' option.
      --source-exclude REGEX  Exclude mbox files from Thunderbird list. This
                              is a insensitive regex list separated by comma.
                              Only used with 'auto' option.
  -w, --windows-format        Convert extracted eml files to windows format.
      --synchronize           Synchonize eml files from available emails list
                              and if 'auto' is set then keep only valid
                              Thunderbird directories.
  -z, --compress              Compress eml in gzip format and add extension
                              '.gz' to file name.
  -i, --with-invalid          Invalid emails are retained. This status is
                              defined when at least one of the 'date' or
                              'from' fields is missing from the header. The
                              date filters do not affect these emails. The
                              names are formatted in '00000000000000_MD5.eml'
                              (or eml.gz).
  -x, --with-duplicated       Duplicated emails are retained during
                              processing.
  -d, --with-deleted          Deleted emails are retained during processing.
                              This status is linked to the 'X-Mozilla-Status'
                              header field and therefore only available with
                              Mozilla Thunderbird email client.
  -u, --url URL               Url for the messages uploading process in eml
                              or gz file format. This option require option
                              'k' to be set to trigger the remote sending
                              process. It is independent of 'e' option.
  -k, --key KEY               Password used to secure exchanges with the
                              remote host.
      --age-min N             Select emails that have more than N days.
      --age-max N             Select emails that have less than N days.
      --date-before DATE      Select emails before the specified date. DATE
                              must be formatted in 'YYYY-mm-dd HH:MM:SS' or
                              'YYYY/mm/dd HH:MM:SS'. The parameters
                              'HH:MM:SS' are optional and would be set to
                              '00:00:00' if it is missing.
      --date-after DATE       Select emails after the specified date. Same
                              syntax as 'date-before'
      --timeout N             Set maximum time in seconds the remote
                              connection request is allowed to take. Used if
                              'u' option is set. WARNING: If defined to 0
                              then process could hang. (default: 600)
      --speed-limit N         Set maximum speed in bytes per second to send a
                              file. Used if 'u' option is set. (default: 0)
      --start-wait N          Delay process waiting to start in seconds.
                              (default: 0)
      --start-random N        Maximum delay before process start in seconds.
                              The random value is added to the 'wait'
                              countdown. (default: 0)
      --log-file FILE         Log what we're doing to the specified FILE.
      --log-maxfiles N        Maximum number of log files. Each log file size
                              does not exceed 1 MB. (default: 5)
  -v, --verbose [=N(=3)]      Verbose level for eml processing (N between 1
                              and 3, 3 is implicit). 1=ERROR, 2=WARNING,
                              3=INFO.
      --version               Display version number.
      --help                  Display command line options.

Examples

  • Simulate an mbox file or a full automatic Thunderbird processing

    mboxzilla -f inbox
    mboxzilla -a
    
  • Extract all messages to local folder as eml files

    mboxzilla -f inbox -o backup_dir -e
    
  • Extract all messages younger than 30 days to local folder

    mboxzilla -f inbox -o backup_dir -e --age-max=30
    
  • Upload all messages found in Thunderbird profiles to URL (see 'server/index.php') with :

    • EML compression to gzip = active
    • Emails age max = 365 days
    • Email profile filter = domain.tld
    • Mbox excluded = trash.,drafts.,junk.,templates.,(./)?private.
    • Include Thunderbird local folders = yes
    • Host URL = https://server_url/mails/
    • Secure key = _password
    • Synchronization = active
    • Upload max speed = 32768 B/s (256 Kb/s)
    • Verbose level = 2 (Errors and warnings)
    • Log file = mboxzilla.log
    mboxzilla -a -z --age-max=365 --email-domain "domain.tld"
          --source-exclude "trash.*,drafts.*,junk.*,templates.*,(.*/)?private.*"
          --with-localfolders -u "https://server_url/mails/" -k "_password"
          --synchronize --speed-limit=32768 -v 2 --log-file=mboxzilla.log
    

    OR

    mboxzilla settings.conf where configuration file contains :

       # mboxzilla configuration example
       auto=true
       compress=true
       age-max=365
       email-domain=domain.tld
       source-exclude=trash.*,drafts.*,junk.*,templates.*,(.*/)?private.*
       with-localfolders=true
       url=https://server_url/mails/
       key=_password
       synchronize=true
       speed-limit=32768
       verbose=2
       log-file=mboxzilla.log
    
  • Upload all messages found in Thunderbird profiles to a folder :

    mboxzilla settings.conf where configuration file contains :

       # mboxzilla configuration example
       auto=true
       compress=true
       email-domain=domain.tld
       source-exclude=trash.*,drafts.*,junk.*,templates.*
       with-localfolders=true
       output=C:\Users\username\Documents\emails
       extract=true
       synchronize=true
       verbose=2
       log-file=mboxzilla.log
    

    Web server requires:

    • a file to manage requests in https://server_url/mails/ (see index.php in "server" directory)
    • a subdirectory in "mails/" to store the exported files (see $target_dir value in index.php)
    • to set the key to decrypt eml files (see $key value)

Informations and advices

  • The mbox source files are read-only access and so are never modified.

  • If 'auto' option is set then output directory for Thunderbird is 'username/profile/name@domain.tld' or 'username/profile/Local Folders'.

  • The (not default) synchronization process works on all the eml files but for directories sync it's only applied for Thunderbird.

  • Remotely exported files are transferred using AES-256-CBC encryption mode

  • Thunderbird IMAP type accounts are ignored.

  • If an error occurred while parsing a mbox then its processing is aborted and goes to next one. In this case there is no files synchronization.

  • This software uses the following external libraries :

How to build

The build requirements are:

  • C++ compiler that supports C++11 regular expressions. For example GCC >= 4.9 or clang with libc++.
  • C++ Libraries : zlib, ssh2, ssl, curl

From linux do :

  • linux binary:
    g++ -Os -s -std=c++11 mboxzilla.cpp mbox_parser.cpp common.cpp easylogging++.cc -o bin/linux/mboxzilla -lcrypto -lcurl -lz -DELPP_NO_DEFAULT_LOG_FILE
    
  • windows executable:
    i686-w64-mingw32-g++ -static -Os -s -std=c++11 mboxzilla.cpp mbox_parser.cpp common.cpp easylogging++.cc -o bin/win32/mboxzilla.exe -lcurl -lpthread -lssl -lssh2 -lcrypto -lz -lws2_32 -lwldap32 -lwinmm -lgdi32 -DCURL_STATICLIB -DELPP_NO_DEFAULT_LOG_FILE
    

The Mbox_parser class can be freely used outside this project.