cyrusimap/cyrus-imapd

Conversations code breaking with missing basecid after copy and rebuild

Closed this issue · 0 comments

brong commented

See #4641

A user was seeing blank rows in the Fastmail UI. Symptom was "Email/query returns emailIds that Email/get says notFound for".

I discovered that there were messages with a missing basecid - the same GUID was in two separate folders, and in one it had a basecid, in the other not. This was for a split conversation. I quickly realised that the fix I put in a while back for copying the CID for the same GUID wasn't also copying the basecid.

So, I fixed that - but realised to get the basecid, we needed to store it in every G key. So I added that (only if you have uuid folders, I figured that was better than doing some odd/even version magic).

Then I found more issues when testing. Turns out, if there was more than one copy, the "B" keys weren't being cleaned up correctly, because the exists count kept finding the other zeroed out record.

Also, when you zero, we should also nuke all the newcid records on the #splitconversations folder. There was an XXX comment to that effect, but it contained incorrect code, so I fixed that too.

In all, I'd say I've spent a solid 8 hours on this. Damn. But it's looking really solid now, and the test case includes copying the same message to two folders and the CIDs and BASECIDs matching across the lot now.

When we update servers, the G key parser is actually backwards compatible, so it's going to be safe to downgrade, though not safe to upgrade again if we change the G key datastructure format AND reuse the version number, so let's don't do that.

I also discovered that ctl_conversationsdb -R writes the G keys a few times, but I believe it's because it's working on two copies of the database so I'm not too stressed about that (yep, I scatted syslogs in places)

The PR also adds cid=<> logging to append and expunge. We already log it for touched, so it just adds a tiny bit more logging and you can see CID changes more easily.