igniterealtime/openfire-monitoring-plugin

Archiving functionality fails with private MUC messages

Closed this issue · 1 comments

The monitoring plugin logs groupchat (MUC) messages (also private group messages, as defined in https://xmpp.org/extensions/xep-0045.html#privatemessage) after they have been delivered to the room occupant(s) using a MUCEventListener.

Because the message that is delivered is stored, the private messages that are archived contain the bare/real JID of the user the message was send to, and the occupant JID of the user that send the message. As opposed to the occupant JID of the destination user.

This causes private group chat messages to appear in archive queries for the real JID, and no messages to appear in queries for the occupant JID.

In addition to appearing in the wrong results, it also exposes the real JID of the destination user, even though the message was send to an occupant JID, and the real jid was unknown to the sending user.

It would be better if both the real JIDs, and the occupant JIDs of the sender and receiver were stored. It would also be better if the stanza was archived before it was delivered, with the original occupant JID as the ‘to’ value.

The private messages shared in a groupchat (PM) are currently stored as if they were regular one-on-one (1:1) messages. During this, the context of the room in which the PM was shared is lost. This means that, from the persisted data, it is no longer possible that a message was once shared as a PM (as opposed to a 1:1).

Although it is desirable to retrieve PMs as part of a group chat message archive, the existing behavior (PMs retrieved through the personal archive) could also have its merits - and might be depended on in various use-cases (FastPath maybe - but also third-party implementations that we have no knowledge about). There is, however, a security concern here: the existing implementation will, always, expose the real JID of the other peer in a PM - even if the room is configured to be semi-anonymous (semi-anonymous rooms have other privacy concerns of this nature, making the impact of this particular issue less critical).

A seemingly obvious fix is to start including a column in Openfire's ofMucConversationLog table that flags the message as a private message. That way, the monitoring plugin could be modified to query for all messages from that table that either have a null value for the new column, or one that matches the bare JID of the local user. A significant downside here is that this causes private messages to be retrieved by everyone, as long as the monitoring plugin is not updated to include the additional check (as default behavior is to simply return everything). Of lesser concern is that other / third party users of that same table might have similar issues.

An alternative solution would be to use the database tables that are managed by the Monitoring plugin itself: ofMessageArchive (and additionally ofConversation and ofConParticipant). These tables should also contain the messages shared in group chats (duplication is occurring). These tables also contain the messages used for personal archives.

It is reasonable to expect that the database tables managed by the Monitoring plugin are used solely by the Monitoring plugin. The concerns of changing content / structure having an undesired compatibility effect (as I noted above for ofMucConversationLog) should not apply.

The current incarnation of the Monitoring plugin uses the ofMessageArchive tables only for that: personal archiving. However, up to a few years ago, the same tables were also used to process groupchat archiving. We moved away from them, as we found that data was also stored by Openfire 'core' (ofMucConversationLog) and having this duplication was silly. The long game was to eventually move everything over - which never happened (or has not happened yet as some might claim).

If we move the implementation back to ofMessageArchive and friends (which comes with some risk, as subtle functional changes might occur, due to different database identifiers being used, ordering being slightly different, etc, etc), I can identify a couple of ways where we could augment functionality to include PMs being returned as part of group chat message archive request.

First: ofMessageArchive currently stores the 'to' and 'from' JID of a message that it stores, but for a groupchat message, the 'to' JID has the nickname of the sender. This seems to go unused. It makes more sense to include the nickname of the recipient. If we'd do that (and purge the existing content), that could be used to identify PMs. We could lookup the nickname of participants through the ofConParticipant table, that's already being joined with in code. However, a concern is that nicknames can change. To avoid someone being able to query for PMs of a previous participant that used the nickname, we'd have to verify that the message was sent during the time that someone held a particular nickname. That's do-able (participants have join and leave dates registered), but I fear that this is error-prone: I'm not sure if we can be sure that the timestamps used in both tables are properly synchronized across the various database systems that we use, and between clustered vs non-clustered use-cases. Even if they are currently safe, it is dangerous to assume that this will remain to be the case in the future Also, when 'leave' timestamps are not recorded for some reason (eg: hard crash of Openfire), it's difficult to write fault-tolerant code that would ensure that messages aren't leaked.

An alternative solution would be to add a column to ofMessageArchive that is either null, or has the (bare?) real JID of the recipient, when the message was a MUC PM. I've yet to identify major downsides to that, other than that it requires a database structure change.