The Rub

AUTOMATICALLY SIMPLE SINCE 2002

Advanced Gmail Filtering for Linux Kernel Lists

Gmail search is great (as one would expect), but email filtering is not as powerful, because filters cannot build on one another: each rule stands alone. This makes filtering email the way I would prefer quite difficult.

My approach to organizing email is zero-inbox based, meaning that instead of filtering my email into folders and reading the folders, I leave emails in my inbox that need to be read, and I don’t remove them from my inbox until they’re read or processed. With this strategy, I typically have a small handful of emails in my inbox by the end of any given day, even though I receive many hundreds of emails per day.

The first crucial point is to never delete anything; only archive. The decision to ‘delete’ is expensive; you have to carefully consider whether or not you will ever need the email. Sure, you can decide that in less than a second, but it adds up and has a much higher cost than the decision I prefer to make: does this email require action from me?

So, there are 3 types of emails: those which I do not need to read, those which I do need to read, and those which require action from me.

The first type, ideally, I never see. In Gmail mechanics, this means skip inbox and archive.

The second type are emails that contain information that is valuable enough to be aware of, but which I can archive after reading. These should land in my inbox, and I will archive them after reading.

The third type requires action. If an action takes less than a minute or two or five, I do it immediately and then archive the email. If it takes longer, I leave it in my inbox until I have either transferred it to some other tracking mechanism or completed the task.

Filtering is therefore responsible for identifying emails of low value and removing them from my inbox.

My Linux kernel development digression: kernel development consists of many mailing lists. The basic strategy seems to be: subscribe to lkml but ignore emails unless you’re named in to: or cc:, with some exceptions. Also, subscribe to subsystem lists as needed. Some lists should be read entirely, others should be filtered to some extent depending on volume. This all gets a bit custom, and depends on which part of the kernel you’re interested in.

The problem is that because of how Gmail implements filters, it is not easy to write rules which say “archive everything on a particular list except things that match this filter”. Every email is evaluated against every filter, and the order is not reliable. There is no concept of doing something in one rule and then referencing that something in a subsequent rule. Filters did once work this way, but Google seems to have broken this behavior as part of the new UI that was deployed in 2018.

The best I have been able to do results in two inboxes: one for flagged emails and one for emails that didn’t match a filter. I ended up with a “todo” label, which serves as a second inbox. As a result, my inbox contains emails which did not match any filters, and my todo label contains emails which my filters caught. This is actually somewhat convenient, as it gives a good way to see which additional filters may be needed.
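To make the two-inbox outcome concrete, here is a rough Python sketch of the behavior described above. It is only a model built on my assumptions, not Gmail's implementation: the `Email` class, the `FILTERS` list, the `lkml.example` list id, and the helper names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Email:
    list_id: str
    subject: str
    labels: set = field(default_factory=set)
    in_inbox: bool = True


# Hypothetical filters: a (predicate, label) pair. Every match also archives.
FILTERS = [
    (lambda e: e.list_id == "lkml.example", "lists-lkml"),
    (lambda e: e.list_id == "lkml.example" and "Linux 4.19" in e.subject, "todo"),
]


def apply_filters(email, filters=FILTERS):
    # Each filter looks only at the email's own fields; none can observe what
    # another filter did, so order is irrelevant and actions just accumulate.
    for matches, label in filters:
        if matches(email):
            email.labels.add(label)
            email.in_inbox = False
    return email


def mailbox_view(email):
    """Where the email ends up under the two-inbox scheme."""
    if "todo" in email.labels:
        return "todo"
    return "inbox" if email.in_inbox else "archived"
```

In this model a release announcement on the list ends up archived with both labels, so it shows up under todo, while mail that no filter matched stays in the real inbox.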

Finally, I’ve found it better to use Gmail filtering’s import/export feature to keep my filters in version control rather than editing in the UI. This lets me change my filters with impunity and without concern for uncontrolled changes.

However, I prefer YAML to XML, so I actually use gmail-yaml-filters. This project lets me define my filter rules in YAML, and then it converts them to an XML file. I can then either set up sync (I didn’t get that to work), or go into Gmail’s filter settings, delete all filters, and then import the filters from the XML file.

Let’s get into some specific rules, which I think may be useful. These are in gmail-yaml-filters style.

# To me or my besties specifically, but not from bots (so, probably from a
# human)
-
  to:
    any:
      - [email protected]
      - [email protected]
      - [email protected]
      - [email protected]
      - [email protected]
      - [email protected]
  from:
    all:
      - [email protected]
      - [email protected]
      - [email protected]
      - [email protected]
  is: -chats # the to: rule picks up google chat logs; exclude them
  label: todo
  archive: true

This rule has a lot going on. First, a list of recipients is specified. Any email with any of those people in to: or cc: will be caught by this filter, unless it’s sent from one of a list of excluded addresses, which are mostly automated/bot senders. Also, chats are excluded - internally it appears that Google Hangouts/Chat logs are stored as emails, and this rule will catch those if they’re not excluded. Lastly, the ‘todo’ label is applied and the email is archived (removed from inbox).

The value of this rule alone is huge. It allows me to track any email, on any list that I’m subscribed to, that involves me or one of my peers.

Let’s look at another:

# Emails from people I follow
-
  from:
    any:
      - [email protected]oundation.org
      - [email protected]
      - [email protected]
      - [email protected]
      - [email protected]
      - [email protected]
      - [email protected]
      - [email protected]
  label: todo
  archive: true

This one is similar to the last one, except it’s filtering on who’s sending the email. The list of people is similar as well. Basically, if I’m interested in seeing what someone is sending, I can add them to this list and all of their emails will end up in my todo box.

# Label all stable emails
-
  list: [email protected]
  label: lists-stable
  archive: true
# Stable review threads should go to todo
-
  subject:
    all:
      - "-3.18" # exclude 3.18
      - PATCH
      - stable review
  list: [email protected]
  label: todo
  archive: true

There are two rules here. The first labels and archives all emails sent to the stable list (the label in this case is just a convenience; I don’t actually go read the mailbox - but it can be handy occasionally to see all stable emails grouped together). The second rule pulls out emails that have certain subject strings. This one is not perfect, but the desire is to pick up threads that contain the words ‘stable review’ as well as ‘PATCH’, unless ‘3.18’ is also present. Such emails end up in todo.
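The intent of the second rule's subject list can be sketched in Python. This is only an approximation under my own assumptions: it uses case-insensitive substring checks, whereas Gmail's real matching works on search terms and may behave differently, and `subject_matches` is a made-up helper, not part of gmail-yaml-filters.

```python
def subject_matches(subject, all_terms):
    """Approximate an `all:` subject list: every plain term must be present
    and every term prefixed with '-' must be absent."""
    text = subject.lower()
    for term in all_terms:
        negated = term.startswith("-")
        needle = (term[1:] if negated else term).lower()
        present = needle in text
        if present == negated:
            # Either a required term is missing or an excluded term is present.
            return False
    return True


# The same three terms as the stable review rule above.
STABLE_REVIEW = ["-3.18", "PATCH", "stable review"]
```

With these terms, a subject such as "[PATCH 4.19 000/105] 4.19.12-stable review" matches, its "Re:" replies match too, and the equivalent 3.18 thread is left out.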

# LKML
-
  list: [email protected]
  label: lists-lkml
  archive: true
# Linux release announcements
-
  list: [email protected]
  subject:
    all:
      - Linux
      - -PATCH
      - -GIT
    any:
      - "4.4"
      - "4.9"
      - "4.14"
      - "4.18"
      - "4.19"
      - "4.20"
  label: todo
  archive: true

These rules took some work to get right. The first archives everything sent to lkml. The second pulls out release announcement threads based on subject line.

Incidentally, it would be really nice if Gmail had a way of applying a filter action to an entire thread instead of just a single email. In this case, I have to filter by subject (and not by from:), because I want to catch Linus’s release announcements as well as any replies to the announcement. This rule has a carrying cost; the list of versions needs to be maintained - though once 5.0 is announced I suppose I can add in the next several years of versions in one fell swoop.
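Generating that version list is easy to script. The following is a small sketch with hypothetical helper names (`version_terms`, `as_yaml_items`); it only prints lines to paste under `any:` and does not talk to gmail-yaml-filters at all.

```python
def version_terms(major, first_minor, last_minor):
    """Return version strings such as "5.0", "5.1", ... as text, not floats."""
    return [f"{major}.{minor}" for minor in range(first_minor, last_minor + 1)]


def as_yaml_items(terms, indent=6):
    # Quote each value: unquoted, a YAML parser reads 4.20 as the float 4.2.
    return "\n".join(f'{" " * indent}- "{term}"' for term in terms)


# Print list items for 5.0 through 5.3, indented to sit under `any:`.
print(as_yaml_items(version_terms(5, 0, 3)))
```

The quoting is the important part; it is the same reason the versions in the rule above are written as strings.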

A few other types of rules:

# Suppress ci successful build notifications
-
  from: [email protected]
  subject: Successful!
  label: junk
  archive: true

Here’s an example of a rule which matches an email that I never want to see. Using from: in these types of rules is useful - then the replies (if from a human) will not be filtered.

# Label all ltp emails and send all to todo
-
  for_each:
    - todo
    - lists-ltp
  rule:
    list: [email protected]
    label: "{item}"
    archive: true

Using for_each, this is actually two rules. The first iteration of the loop applies the label ‘todo’ to all ltp emails. This is an example of a mailing list where I want to see all emails. The second iteration applies the label lists-ltp. I give every email in every list a lists-X label for consistency.
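To show what the loop expands to, here is an illustrative Python sketch. It is not the tool's actual code, just my reading of the behavior; `expand_for_each` and the `ltp.example` list id are hypothetical stand-ins.

```python
def expand_for_each(entry):
    """Expand a {'for_each': [...], 'rule': {...}} entry into concrete rules
    by substituting each item for '{item}' in the rule's string values."""
    rules = []
    for item in entry["for_each"]:
        rule = {
            key: value.replace("{item}", item) if isinstance(value, str) else value
            for key, value in entry["rule"].items()
        }
        rules.append(rule)
    return rules


# Mirrors the ltp entry above, with a placeholder list id.
ltp = {
    "for_each": ["todo", "lists-ltp"],
    "rule": {"list": "ltp.example", "label": "{item}", "archive": True},
}

for rule in expand_for_each(ltp):
    print(rule)
```

Each resulting rule has the same list and archive settings and differs only in its label, which is why one for_each entry is enough to apply both.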

This is still evolving, but this is what I’m using today to good effect. I’d really like to learn how to keep all emails in the inbox or in a single label, but I haven’t figured out how to do that yet.