lefcha/imapfilter

contain_field produces false positives

oaken-source opened this issue · 2 comments

I have recently noticed that my searches for the X-Spam-Flag header are producing false positives. This is the minimal reproducing example from my config.lua:

local account = IMAP {
    server = 'owa.example.com',
    username = 'john.doe',
    password = 'secret',
    ssl = 'ssl3',
}

-- Server-side search for messages whose X-Spam-Flag header contains YES.
results = account.INBOX:contain_field('X-Spam-Flag', 'YES')
for _, msg in ipairs(results) do
    local mbox, uid = unpack(msg)
    -- Fetch the header again to see what each hit really contains.
    local spam = mbox[uid]:fetch_field('X-Spam-Flag')
    print(spam)
end
results:move_messages(account['Junk-E-Mail'])

There is one message in my inbox with the X-Spam-Flag field set to YES, but the UID SEARCH ALL HEADER X-Spam-Flag "YES" IMAP command generated by imapfilter consistently returns three results. The for ... end block above shows that two of those three messages actually have the field set to NO, not YES. This is the info output of imapfilter -n:

Fetched field "X-Spam-Flag" of john.doe@owa.example.com/INBOX[200695].
X-Spam-Flag: NO
Fetched field "X-Spam-Flag" of john.doe@owa.example.com/INBOX[200714].
X-Spam-Flag: NO
Fetched field "X-Spam-Flag" of john.doe@owa.example.com/INBOX[200715].
X-Spam-Flag: YES
3 messages moved from john.doe@owa.example.com/INBOX to john.doe@owa.example.com/Junk-E-Mail.

and this is the relevant information from the debug log:

getting response (4):                                                                                                                                   
* OK The Microsoft Exchange IMAP4 service is ready.
sending command (4):
1000 NOOP                                                                             
getting response (4):
1000 OK NOOP completed.                                                               
sending command (4):
1001 CAPABILITY
getting response (4):
* CAPABILITY IMAP4 IMAP4rev1 AUTH=PLAIN SASL-IR UIDPLUS MOVE ID UNSELECT CHILDREN IDLE NAMESPACE LITERAL+
1001 OK CAPABILITY completed.
sending command (4):                                                                  
1002 LOGIN "john.doe" *
getting response (4):                                                                 
1002 OK LOGIN completed.
sending command (4):
1003 CAPABILITY      
getting response (4):   
* CAPABILITY IMAP4 IMAP4rev1 AUTH=PLAIN SASL-IR UIDPLUS MOVE ID UNSELECT CLIENTNETWORKPRESENCELOCATION CHILDREN IDLE NAMESPACE LITERAL+
1003 OK CAPABILITY completed.                                                         
sending command (4): 
1004 NAMESPACE                                                                        
getting response (4):
* NAMESPACE (("" "/")) NIL NIL
1004 OK NAMESPACE completed.
namespace (4): '' '/'   
sending command (4):       
1005 SELECT "INBOX"
getting response (4):
* 103 EXISTS
* 0 RECENT
* FLAGS (\Seen \Answered \Flagged \Deleted \Draft $MDNSent)
* OK [PERMANENTFLAGS (\Seen \Answered \Flagged \Deleted \Draft $MDNSent)] Permanent flags
* OK [UNSEEN 102] Is the first unseen message
* OK [UIDVALIDITY 14] UIDVALIDITY value
* OK [UIDNEXT 200717] The next unique identifier value
1005 OK [READ-WRITE] SELECT completed.
sending command (4):
1009 UID SEARCH ALL HEADER X-Spam-Flag "YES"
getting response (4):
* SEARCH 200695 200714 200715
1009 OK SEARCH completed.
sending command (4):
100B UID FETCH 200695 BODY.PEEK[HEADER.FIELDS (X-Spam-Flag)]
getting response (4):
* 98 FETCH (BODY[HEADER.FIELDS (X-Spam-Flag)] {19}
X-Spam-Flag: NO
 UID 200695)
getting response (4):
100B OK FETCH completed.
sending command (4):
100D UID FETCH 200714 BODY.PEEK[HEADER.FIELDS (X-Spam-Flag)]
getting response (4):
* 101 FETCH (BODY[HEADER.FIELDS (X-Spam-Flag)] {19}
X-Spam-Flag: NO
 UID 200714)
getting response (4):
100D OK FETCH completed.
sending command (4):
100F UID FETCH 200715 BODY.PEEK[HEADER.FIELDS (X-Spam-Flag)]
getting response (4):
* 102 FETCH (BODY[HEADER.FIELDS (X-Spam-Flag)] {20}
X-Spam-Flag: YES
 UID 200715)
getting response (4):
100F OK FETCH completed.

Is this a misbehaving IMAP server? Is there anything I can do on my end to work around this?

This specific IMAP server has always caused problems. I remember people reporting that searching on that server didn't work at all in some cases, and they had to use the match functions (instead of the contain ones).

If it's only false positives, you could verify each hit by fetching the header field, as you already do, and discarding the ones that don't match. But if there are also false negatives, then it's better to fetch the field for all messages and do the matching on your end.
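
For what it's worth, the verification step could look roughly like the sketch below. It is plain Lua and only an illustration: flag_is_yes and verify_hits are hypothetical helper names, not part of imapfilter, and the fetch callback is assumed to return the raw header line in the form seen in the output above (e.g. "X-Spam-Flag: NO").

```lua
-- Hypothetical helper: true only if a raw header line such as
-- "X-Spam-Flag: YES\r\n" carries exactly the value YES
-- (case-insensitive, surrounding whitespace ignored).
local function flag_is_yes(header)
    if type(header) ~= 'string' then
        return false
    end
    -- Capture everything after the first colon, trimmed of whitespace.
    local value = header:match('^[^:]*:%s*(.-)%s*$')
    return value ~= nil and value:upper() == 'YES'
end

-- Hypothetical helper: keep only the search hits whose header really says YES.
-- hits  : list of { mbox, uid } pairs, as yielded by ipairs(results)
-- fetch : assumed callback, fetch(mbox, uid) -> raw header string
local function verify_hits(hits, fetch)
    local verified = {}
    for _, msg in ipairs(hits) do
        local mbox, uid = msg[1], msg[2]
        if flag_is_yes(fetch(mbox, uid)) then
            table.insert(verified, msg)
        end
    end
    return verified
end
```

In a config.lua the callback would presumably be something like function(mbox, uid) return mbox[uid]:fetch_field('X-Spam-Flag') end, and the returned table would likely need to be turned back into an imapfilter message set before calling move_messages on it. Both of those are assumptions and untested against this server.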

I managed to get more reliable results by replacing

results = account.INBOX:contain_field('X-Spam-Flag', 'YES') 

with

results = account.INBOX:match_field('X-Spam-Flag', '^YES$')

Thanks for the advice!