mbrt/gmailctl

Why does gmailctl split this kind of "or" filter into separate rules?

Opened this issue · 19 comments

I've noticed gmailctl likes to split some "or" conditions into separate filters where I would've expected a single Gmail filter. It's a cosmetic thing, but affects a lot of my filter rules and ends up bloating/obfuscating my Gmail filters list quite a bit in practice.

For example:

{
  filter: {
    or: [
      { list: 'list1.example.com' },
      { cc: 'list1@example.com' },
    ],
  },
  actions: { archive: true },
},

Will produce:

Filters:
--- Current
+++ TO BE APPLIED
@@ -1 +1,10 @@
+* Criteria:
+    query: list:list1.example.com
+  Actions:
+    archive
 
+* Criteria:
+    query: cc:list1@example.com
+  Actions:
+    archive
+

instead of a single query list:list1.example.com OR cc:list1@example.com.

Is there a reason for that?
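To make the two renderings concrete, here is a minimal Python sketch. It is not gmailctl's code (gmailctl is written in Go), and `leaf_to_query`, `render_split` and `render_combined` are hypothetical names. It only shows the difference between one Gmail query per `or` operand and a single `OR` query.

```python
# Hypothetical sketch; not how gmailctl is implemented.

def leaf_to_query(leaf):
    """Render a single-operator leaf such as {'list': 'x'} as 'list:x'."""
    [(op, value)] = leaf.items()
    return f"{op}:{value}"


def render_split(or_operands):
    """One Gmail query per operand (the behavior shown in the diff above)."""
    return [leaf_to_query(leaf) for leaf in or_operands]


def render_combined(or_operands):
    """A single query joining all operands with OR."""
    return " OR ".join(leaf_to_query(leaf) for leaf in or_operands)


operands = [{"list": "list1.example.com"}, {"cc": "list1@example.com"}]
print(render_split(operands))
print(render_combined(operands))
```

Values are not escaped or quoted here, which a real implementation would need to handle.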

mbrt commented

Big filters make Gmail flaky (i.e. filters get applied inconsistently). This doesn't happen if you split them up into multiple filters. Gmailctl does this aggressively to prevent the problem.

Re bloating/obfuscating: Anything non-trivial is almost impossible to understand from the Gmail settings UI. This was one of my main motivations for starting the project. I don't think that keeping filters from splitting will make it any more clear.

What about some hint I could add to a rule config to request not splitting, or some heuristic to not split tiny rules?

I understand it's a good safe default but for these cases it actually makes my filters much much harder to read and reason about in practice, if understandability in the Gmail settings UI is a motivating factor.

What about some hint I could add to a rule config to request not splitting...?

Sent #373 with a proposal for that. @mbrt could you take a look and see what you think about the direction of it and naming before I put energy into docs & tests for it?

mbrt commented

I'm trying to keep gmailctl as simple as possible, so I don't normally add options if not strictly necessary. The only one we have right now is isEscaped.

It's my experience that people ask for features when they first use the project thinking they will need them, but then later on they actually don't use them. This is also the first time it was asked, so I'm doubly reluctant.

Would it be possible for you to keep this on your fork for a few months and see whether it's actually what you want and makes the usability better?

I'm open to discussing implementation afterwards. Glancing at your PR, it seems easy enough and going in the right direction.

Thanks, absolutely! Agreed about simplicity, and I wouldn't want to add a weird option on a whim if we'd end up regretting the mechanics or naming of it.

That override scratches my itch but does feel a little hacky. I honestly feel like some heuristics to automatically avoid splitting some trivial cases would be better. Would you be up for having a little more discussion here about the mechanics of it and seeing if we can come up with something better to merge? I could also share some more specifics about my use cases and why this type of split has been getting so annoying, if it helps.

mbrt commented

Oh yes, sure! I'm all in for making the heuristics better.

Great, so to start somewhere concrete: whatever heuristics we end up with, would you be fine if that example I gave in the issue description didn't split?

In changing the behavior it's also worth considering that existing users might see confusing extra diffs next time they apply changes after updating. That's probably not worth worrying about too much but might warrant some note in the documentation or something, recommending to check diff when you update. Or OTOH gmailctl could do something clever to try to prefer whichever behavior minimizes the diff for each existing filter?

mbrt commented

Yes, I think it's fine to not split your example.

In changing the behavior it's also worth considering that existing users might see confusing extra diffs

It's precisely my worry, but we did change the logic a bit in the past, so it would be fine if things don't get too confusing.

gmailctl could do something clever to try to prefer whichever behavior minimizes the diff for each existing filter?

Uhm, yes, minimizing diffs also sounds nice, but I wonder how complicated that would be. It potentially requires splitting filters up at the time we do the diff (rather than earlier). We currently just do a best-effort similarity match to minimize diffs, but perhaps we can add logic to that.
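As a rough illustration of what a best-effort similarity match can look like, the following Python sketch greedily pairs each removed filter with the most similar added one. This is an assumption for illustration only, not gmailctl's actual matching logic. `pair_by_similarity` is a hypothetical name and the filters are plain strings.

```python
from difflib import SequenceMatcher


def pair_by_similarity(removed, added):
    """Greedily pair each removed filter with its most similar added filter.

    Hypothetical sketch: filters are compared as plain strings.
    """
    remaining = list(added)
    pairs = []
    for old in removed:
        if not remaining:
            break
        best = max(
            remaining,
            key=lambda new: SequenceMatcher(None, old, new).ratio(),
        )
        remaining.remove(best)
        pairs.append((old, best))
    return pairs


removed = ["from:a -> label X", "list:foo -> archive"]
added = ["list:foo bar -> archive", "from:{a b} -> label X"]
for old, new in pair_by_similarity(removed, added):
    print(f"{old!r} ~ {new!r}")
```

Deciding between a split and a non-split rendering at diff time would mean running something like this against both candidates and keeping whichever leaves fewer unmatched filters.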

Okay great, so as a start I'd propose to not split a filter that's <10 query terms and <100 values, or some numbers in that ballpark.

For example:

  • { or: [{ list: x } for x in ['a', 'b', 'c']] + [{ cc: x } for x in ['a', 'b', 'c']] } wouldn't split because as a Gmail query it's 2 terms and 6 values
  • { or: [{ list: x } for x in ['a', 'b', … 'z'] * 4] } would split because it's only 1 term but 26×4=104 values
  • { or: [{ and: [{ to: 'a' }, { not: { and: [{ cc: 'b' }, { cc: 'c' }, { or: [ … ]}] } }] }] } might split if you added enough items down in the "…", because as a Gmail filter it would have enough terms of "to:", "cc:", etc. to potentially hit the limit of 10

That should split most items that are either computationally heavy or visually awful to read. WDYT?
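One possible reading of that proposal, sketched in Python. The counting rule is an assumption, since the proposal leaves it open: here a "term" is a group of same-operator leaves under one parent, and a "value" is a single leaf. `count` and `should_split` are hypothetical names, and the thresholds are the ballpark numbers from the comment.

```python
MAX_TERMS = 10     # "<10 query terms" from the proposal
MAX_VALUES = 100   # "<100 values" from the proposal

COMPOUND = ("and", "or", "not")


def count(node):
    """Return (terms, values) for a filter tree of nested single-key dicts.

    Assumption: same-operator leaves under one parent count as one term.
    """
    [(op, arg)] = node.items()
    if op == "not":
        return count(arg)
    if op in ("and", "or"):
        terms = values = 0
        leaf_ops = set()
        for child in arg:
            [(child_op, _)] = child.items()
            if child_op in COMPOUND:
                t, v = count(child)
                terms += t
                values += v
            else:
                leaf_ops.add(child_op)
                values += 1
        return terms + len(leaf_ops), values
    return 1, 1  # a bare leaf such as {'to': 'a'}


def should_split(node):
    """Split only when the filter reaches either threshold."""
    terms, values = count(node)
    return terms >= MAX_TERMS or values >= MAX_VALUES


small = {"or": [{"list": x} for x in "abc"] + [{"cc": x} for x in "abc"]}
big = {"or": [{"list": x} for x in list("abcdefghijklmnopqrstuvwxyz") * 4]}
print(count(small), should_split(small))
print(count(big), should_split(big))
```

Under this reading the first example counts as 2 terms and 6 values and the second as 1 term and 104 values, matching the numbers in the list above.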

mbrt commented

If the goal is clarity, there are a few more things to consider:

The filter criteria itself is not a single query string. It's represented in Gmail as this object, which maps nicely to what you see in the settings UI.

A filter like from:a OR subject:b can be represented by a single filter using the query field like this:

{
  "query": "{from:a subject:b}"
}

or by two simpler filters:

{
  "from": "a"
}
{
  "subject": "b"
}

Additionally, we are already grouping filters of this type: from:a OR from:b OR from:c to become: from:{a b c}, which is a single compact filter. This is how we're currently trying to not go overboard with the number of filters we split into.
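The grouping described here can be sketched in a few lines of Python. This is an illustration under simplifying assumptions (flat `or` of single-operator leaves, no escaping), not gmailctl's implementation. `group_or` is a hypothetical name.

```python
from collections import defaultdict


def group_or(leaves):
    """Group OR-ed single-operator leaves by operator.

    Leaves that share an operator collapse into one 'op:{v1 v2 ...}' query.
    """
    by_op = defaultdict(list)
    for leaf in leaves:
        [(op, value)] = leaf.items()
        by_op[op].append(value)

    queries = []
    for op, values in by_op.items():
        if len(values) == 1:
            queries.append(f"{op}:{values[0]}")
        else:
            queries.append(f"{op}:{{{' '.join(values)}}}")
    return queries


print(group_or([{"from": "a"}, {"from": "b"}, {"from": "c"}]))
print(group_or([{"from": "a"}, {"subject": "b"}]))
```

The first call yields a single compact filter, while the second still yields two filters, which is the case this issue is about.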

I'm not sure that the first version is more clear than the second to be honest. Keeping things together until they become too large may not give the best results, because everything is dumped into the query field, instead of using the "purpose fields".

Right, the heuristics I'm proposing are in terms of the final Gmail filter syntax, and how some of the counting gets applied might be a little subjective.

I'm not claiming that one trivial "{from:a subject:b}" filter rule is a dramatic improvement over separate from and subject rules. Here I can get into some more specific use cases to illustrate more clearly why some of these splits are so annoying in practice...

Cases

One aspect that gets annoying is when it's splitting across multiple dimensions, so that one gmailctl rule like this:

{
  filter: { or: [{ from: 'a' }, { subject: 'b' }] },
  actions: { labels: ['X', 'Y'] },
}

gets split into four Gmail filters instead of two:

  • from: 'a' -> label: 'X'
  • from: 'a' -> label: 'Y'
  • subject: 'b' -> label: 'X'
  • subject: 'b' -> label: 'Y'
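The growth is multiplicative: splitting along both the criteria and the labels gives one filter per combination. A small Python sketch of that arithmetic (`split_rule` is a hypothetical name, not a gmailctl function):

```python
from itertools import product


def split_rule(criteria, labels):
    """Return one (criterion, label) filter per combination."""
    return list(product(criteria, labels))


filters = split_rule(["from:a", "subject:b"], ["X", "Y"])
for criterion, label in filters:
    print(f"{criterion} -> label:{label}")
print(len(filters))  # 2 criteria x 2 labels = 4 filters
```

With 3 OR-ed criteria and 3 labels the same rule would already produce 9 filters.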

Another scenario making the splits seem really unmaintainable in practice is where I've found mailing list emails surprisingly hard to filter and made a compound helper for them. list: will catch some cases where I get the email via the list, to: and cc: seem to catch other cases like To: me, somelist or To: somelist, List: someparentlist. So I set up a compound helper for to: somelist OR cc: somelist OR list: somelist and use it for some very long lists of lists. Where those get too long to be "easily readable" like to:{a b c d e f g} OR cc:{a b c d e f g} OR list:{a b c d e f g} I'd much prefer to manually split them along logical lines in the other dimension like

  • to:{a b c d} OR cc:{a b c d} OR list:{a b c d} -> someaction
  • to:{e f g} OR cc:{e f g} OR list:{e f g} -> someaction

than to have them automatically split along arbitrary structural dimensions into

  • to:{a b c d e f g} -> someaction
  • cc:{a b c d e f g} -> someaction
  • list:{a b c d e f g} -> someaction
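The preferred manual split can be sketched as follows in Python. The actual helper in the config would be written in Jsonnet and is not shown in this thread, so `list_query` and the grouping below are hypothetical.

```python
def list_query(addresses, operators=("to", "cc", "list")):
    """Build 'to:{...} OR cc:{...} OR list:{...}' for one logical group."""
    joined = " ".join(addresses)
    return " OR ".join(f"{op}:{{{joined}}}" for op in operators)


# Groups chosen by hand along logical lines, not by operator.
groups = [["a", "b", "c", "d"], ["e", "f", "g"]]
queries = [list_query(group) for group in groups]
for query in queries:
    print(f"{query} -> someaction")
```

Each group stays in one filter that still covers all three operators, and the number of filters follows the number of logical groups.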

Impacts of splitting

Besides just making the list of gmail filters hard to directly skim through and understand, there are two aspects that get extra frustrating when there are too many splits:

  1. Ordering: I believe I've seen for whatever reason that when it splits rules it doesn't necessarily keep them ordered logically, making it harder just to read the diffs.
  2. Manually applying in Gmail: As I'm managing filters it's useful to check them in situ in Gmail, clicking "edit" on the new/modified filters to see which of my emails they match in practice and sometimes to manually apply to existing emails. These processes get way more awkward where there are unwanted splits.

mbrt commented

I understand what you mean with "annoying splitting across arbitrary dimensions".

Some more information to chew on:

  • You don't need to:A OR cc:A, as to:A is effectively an alias for TO, CC and BCC. Yes, both list and to are sometimes necessary, as the list: field is not always populated for some reason. I don't know enough about email to understand why.
  • Ordering is impossible to set through Gmail APIs in the way we manage the filters. We always delete and re-create filters when they change, because there's no way to distinguish which is which. This means we end up with updated filters at the bottom of the list.
  • You can debug what a filter does in several ways: gmailctl debug gives you handy links to Gmail search. You can add tests to see whether what you're doing makes sense.
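For readers who want a similar link by hand, a Gmail search URL can be built from a query string roughly like this. The URL shape is an assumption about the Gmail web UI and is not necessarily what `gmailctl debug` prints. `gmail_search_url` is a hypothetical helper.

```python
from urllib.parse import quote


def gmail_search_url(query, account_index=0):
    """Build a Gmail web search link for a query (assumed URL shape)."""
    encoded = quote(query, safe="")
    return f"https://mail.google.com/mail/u/{account_index}/#search/{encoded}"


print(gmail_search_url("list:list1.example.com OR cc:list1@example.com"))
```

Opening such a link shows which existing messages the non-split query matches, independent of how the filter ends up being split.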

After a certain size I found any interaction with the Gmail settings web UI just too hard. This is really why gmailctl was born. It's because understanding and managing filters that way was just a lost cause for me. I'm not really sure the changes you're proposing will make it a lot easier TBH. You will still end up with large filters and finding them in the UI will be hard.

Thanks, that helps a good deal actually. Just removing the redundant cc: shaved hundreds of lines off my corp filter config per gmailctl's diff! 😲

It looks like gmailctl debug links to the non-split versions of queries even when the versions pushed as filters will be split? That's also really helpful, even if the output is a little hard to navigate. I wonder if gmailctl could have a mechanism, when pushing updated filter rules, to dump links to just those queries to the CLI for manual inspection. That would help for use cases where I want to reprocess old emails using the updated filters (the "Also apply filter to matching conversations" toggle in Gmail's filter dialog).

Still, I feel like avoiding frivolous splits based on some heuristics can't hurt much even if it doesn't help hugely either. Is the concern about adding complexity to the project or something about making user experience more complicated?

mbrt commented

It looks like the gmailctl debug links to the non-split versions of queries even when the versions pushed as filters will be split?

Correct.

I wonder if gmailctl could have a mechanism when pushing updated filter rules to dump links to just those queries to cli for manual inspection.

Yes, I believe this shouldn't be too hard to implement.

Still, I feel like avoiding frivolous splits based on some heuristics can't hurt much even if it doesn't help hugely either. Is the concern about adding complexity to the project or something about making user experience more complicated?

I wanted to make sure this wasn't an XY problem. I also would like to keep the behavior as consistent as possible, as any changes there will generate diffs for many existing users.

I think it's fine not to split small filters, although I don't know what the right limit would be. I would set the limit small though.

BTW, is there anything we could do to improve the visibility of the debug command, or perhaps a way to integrate it better with the diff?

BTW, is there anything we could do to improve the visibility of the debug command, or perhaps a way to integrate it better with the diff?

Absolutely, I didn't realize the debug command existed until you mentioned it, then saw there was a single mention of it in the README. In general just sprinkling around a few pointers to it in related commands/workflows might've helped, and I might also suggest renaming it (to something like inspect or analyze?) since debug kinda sounds like something lower-level than I would've been looking for here. If you want we could fork off an FR for making it more discoverable and dig into more specifics on that.

For this issue on splitting, I'm happy to give it some time and decide later what to change if anything. The especially frivolous splits still very much annoy me but we probably wouldn't want to rush into any disruptive behavior change.

mbrt commented

Sounds good, thanks! If you have any concrete improvements on debug feel free to propose. I'm e.g. open to renaming to inspect as it seems indeed more fitting.

Not sure whether I would keep both for a few releases (+ warning) and then delete debug, but perhaps I'm being paranoid.

This issue is stale because it has been open for 30 days without activity.
This will be closed in 7 days, unless you add the 'lifecycle/keep-alive' label or comment.

(I'm late to the party, but my 2c:) I recently ran up to the 1000-filters limit in Gmail, and I found that part of that was because of the kind of filter-splitting that this issue is describing. If my ORs hadn't been split, that could have eliminated hundreds of filters, and I'd still have plenty of headroom before hitting 1000 again.

(Ack that I am almost definitely an extreme edge case.)

mbrt commented

@jameskoh oh that's interesting. I wasn't aware of limits on the number of filters. You're the first one I've heard of hitting that BTW.

I guess we can reopen this, even though I don't know yet of a good heuristic for when to avoid splitting. Perhaps the only one is #372 (comment), i.e. avoid splitting tiny rules.