lefcha/imapfilter

[support needed] Deduplicating emails

Opened this issue · 2 comments

Hi all, sorry to ask a question that has been asked before, but I did not find a solution:
I am using an IMAP server that offers only one INBOX but several mail aliases. When a mail comes in addressed to two or more aliases, it is duplicated once for each alias.
With imapfilter I can move and copy messages to subfolders depending on the To or Cc field, but this moves all the copies.
So I need a way to first dedup the messages. How can I achieve this?
What I found by searching is an example that covers a similar case:

messages = myaccount.INBOX:select_all()
results = Set {}
for _, message in ipairs(messages) do
mailbox, uid = table.unpack(message)
messageId = mailbox[uid]:fetch_header('Message-Id')
if seen[messageId] then
table.insert(results, uid)
else
seen[messageId] = true
end
end
results:delete_messages()

I tried it, but it won't run because I am obviously missing something. Can anyone explain what I am missing?

I think this example makes some assumptions and has missing parts. A more correct and complete version would be this one (ref: #106 (comment)):

seen = {}
duplicates = Set {}
results = account["Inbox"]:select_all()
for _, message in ipairs(results) do
        mailbox, uid = table.unpack(message)
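        messageId = mailbox[uid]:fetch_field("Message-Id")
        -- Skip messages whose header could not be fetched.
        if messageId then
                -- Remove prefix to ignore Id/ID difference.
                messageId = string.sub(messageId, 12)
                if seen[messageId] then
                        -- Already seen this Message-Id: record the copy.
                        table.insert(duplicates, {mailbox, uid})
                else
                        seen[messageId] = true
                end
        end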
end
duplicates:mark_seen()
duplicates:move_messages(account["dups"])

But instead of doing mark_seen() and move_messages() at the end, you can do:

duplicates:delete_messages()
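
As a side note, the fixed offset in string.sub(messageId, 12) relies on the header name having exactly that length. Below is a minimal sketch of a more tolerant comparison key, written in plain Lua so it can be tried outside imapfilter. The names message_key and find_duplicates are hypothetical helpers, not part of imapfilter, and the sketch assumes the fetched value is a string of the form "Message-Id: <...>":

```lua
-- Hypothetical helper (not part of imapfilter): turn a raw header line such
-- as "Message-ID: <abc@example.org>\r\n" into a key that can be compared.
function message_key(header)
  if type(header) ~= "string" then
    return nil
  end
  -- Drop the field name up to the first colon, whatever its capitalisation.
  local value = header:match("^[^:]*:(.*)$") or header
  -- Trim surrounding whitespace, including a trailing CRLF.
  value = value:match("^%s*(.-)%s*$")
  if value == "" then
    return nil
  end
  return value
end

-- Hypothetical helper: given a list of raw header lines, return the indices
-- of every entry whose key was already seen earlier in the list.
function find_duplicates(headers)
  local seen, duplicates = {}, {}
  for index, header in ipairs(headers) do
    local key = message_key(header)
    if key then
      if seen[key] then
        duplicates[#duplicates + 1] = index
      else
        seen[key] = true
      end
    end
  end
  return duplicates
end
```

In the loop above, this would mean using message_key(messageId) as the index into seen instead of the string.sub() result, and skipping the message when it returns nil. Messages without a usable Message-Id are never treated as duplicates in this sketch, which seems the safer default before calling delete_messages().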

> So I need a way to first dedup the messages

I wonder if running mail-deduplicate first would be an option?