Only test under MacOS now.
You should have brew
installed. If not, you can download from here
brew install isync
brew install mu
Use gpg
to encrypt email password.
brew install gnupg gnupg2
- create a file and write your email password in it. For example:
echo my-password > ~/pass
- use gpg encrypt the file with password:
gpg -c -o ~/pass.gpg ~/pass
- then config
~/pass.gpg
to.mbsyncrc
‘sPassCmd
, see below
- create a folder,
mkdir ~/Maildir
- create a file,
~/.mbsyncrc
fullill .mbsyncrc
with your email config, for example:
# your email IMAP config, setup your account
IMAPAccount spike
Host outlook.office365.com
Port 993
User <your-email-address>
# your need to use gpg to decript your pass if you have encrypted
PassCmd "gpg -q --for-your-eyes-only --no-tty -d ~/pass.gpg"
# if you do not encrypt your pass, you can use this:
# Pass <your email password>
SSLType IMAPS
SSLVersions TLSv1.2
# Remote Store
# set `Far`(remote) store
IMAPStore remote
Account spike
# set `Near` (local) store
MaildirStore local
# where to save your email
# if folder not exist, you need to create it
Path ~/Maildir/spike
# where to save your inbox
Inbox ~/Maildir/spike/INBOX
SubFolders Verbatim
# Channels
Channel spike
# refer to IMAPStore name
Far :remote:
# refer to MaildirStore name
Near :local:
Patterns *
Expunge None
CopyArrivalDate yes
Sync All
Create Near
SyncState *
For more info, you can find here.
mbsync -a
This may take a long time if you have lots of emails.
if your emails has CJK characters,maybe you should set this before:
export XAPIAN_CJK_NGRAM=1
then
mu init
mu index
By default, mu will try to index your email under ~~/.Maildir~
git clone https://github.com/Spike-Leung/invoice-compress.git
cd invoice-compress
PUPPETEER_PRODUCT=firefox yarn
yarn extract
By default, this script will extract all invoice in current month.
But you can also specify a time range to extract. For example:
yarn extract 20220401...20220502
- use
isync
to download all emails - use
mu
to index, find, and extract pdf in emails (in the script, mu find the email subject with “发票”, you can fork this repo to custom yours) - use
mu
to find invoice email without pdf, but have a download links - use
mailParser
parse the email, and usejsdom
query the download links - then use
http
andhttps
node module to download pdf from links - if there are links can not download directly, write a adapter which use
puppeteer
or something else to find the download links
- [ ] calculate invoice sum