w3c/webextensions

Proposal: allow storing metadata in a static ruleset

Opened this issue · 14 comments

Problem

The proposal is tightly linked to the changes announced by the Chrome Web Store team, i.e. allowing "fast track" review for extension updates in which only the static rulesets change.

In the case of content blocking extensions, static rulesets are the result of converting traditional ad blocking rules to the DNR syntax. To make debugging rules possible, we need to be able to show what the source rule was. In the current AdGuard prototype extension we achieve this by packing "source maps" alongside the static rulesets.

However, updating static rulesets means updating the source maps as well, and this makes the update ineligible for "fast track".

Proposal

The solution would be to store the metadata within the static ruleset file. This is actually even easier to understand than the approach we're using now.

We checked Chrome and Firefox. At the moment Chrome places no limitation on "unknown" properties in the JSON file, while Firefox silently ignores rules with unknown fields. So with Chrome we can already pack metadata into the static ruleset file and be eligible for "fast track". The problem is that this is unreliable: future fields might collide with the ones we're using, or the ruleset parser behavior might simply change.

So I'd propose to mention in the documentation that there's a reserved property name (metadata, for instance) that can be used for storing any additional metadata the developer wants to keep there.

Example:

{
  "id" : 1,
  "priority": 100000,
  "action" : { "type" : "block" },
  "condition" : {
    "urlFilter" : "||example.org^",
    "initiatorDomains" : ["foo.com"],
    "resourceTypes" : ["script"]
  },
  "metadata": {
      "source": "||example.org^$script,domain=foo.com,important"
  }
}

Another example (a bit more complicated case, several source rules -> one DNR rule):

{
  "id": 1,
  "action": {
    "type": "redirect",
    "redirect": {
      "transform": {
        "queryTransform": {
          "removeParams": [
            "param1",
            "param2"
          ]
        }
      }
    }
  },
  "condition": {
    "urlFilter": "||example.com^",
    "resourceTypes": [
      "xmlhttprequest"
    ],
    "isUrlFilterCaseSensitive": false
  },
  "metadata": {
    "source": [
      "||example.com^$xmlhttprequest,removeparam=param1",
      "||example.com^$xmlhttprequest,removeparam=param2"
    ]
  }
}
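
To illustrate how a debugging tool might consume this field, here is a minimal TypeScript sketch that maps each DNR rule ID to the source rules it came from. It assumes the "metadata.source" layout from the two examples above (a single string or an array of strings), which is only the proposal's example and not a standardized schema. The names RuleWithMetadata and buildSourceIndex are hypothetical.

```typescript
// Shape assumed from the examples above; not a standardized schema.
interface RuleWithMetadata {
  id: number;
  metadata?: { source?: string | string[] };
  [key: string]: unknown;
}

// Builds a lookup from DNR rule ID to the source rules it was converted from.
// Rules without "metadata.source" are simply left out of the index.
function buildSourceIndex(rulesetJson: string): Record<number, string[]> {
  const rules = JSON.parse(rulesetJson) as RuleWithMetadata[];
  const index: Record<number, string[]> = {};
  for (const rule of rules) {
    const source = rule.metadata?.source;
    if (source === undefined) {
      continue;
    }
    // Normalize "one source rule" and "several source rules" to an array.
    index[rule.id] = Array.isArray(source) ? source : [source];
  }
  return index;
}

// Example input modeled on the two rules above (IDs made distinct).
const exampleRuleset = JSON.stringify([
  {
    id: 1,
    action: { type: "block" },
    condition: { urlFilter: "||example.org^", resourceTypes: ["script"] },
    metadata: { source: "||example.org^$script,domain=foo.com,important" },
  },
  {
    id: 2,
    action: { type: "redirect" },
    condition: { urlFilter: "||example.com^" },
    metadata: {
      source: [
        "||example.com^$xmlhttprequest,removeparam=param1",
        "||example.com^$xmlhttprequest,removeparam=param2",
      ],
    },
  },
  { id: 3, action: { type: "block" }, condition: { urlFilter: "||example.net^" } },
]);

const sourceIndex = buildSourceIndex(exampleRuleset);
console.log(sourceIndex[2]);
```

Given the ID of a matched rule, the tool can then show the original filter rules from sourceIndex.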

Safari is also fine with unknown keys. Seems fine to me, though I would suggest a name like metadata since Meta is a company name now.

Suggesting comment because both meta and metadata are meaningful terms in programming that unequivocally suggest that this section is somehow meaningful for the API. The name should explicitly state its lack of any impact on the rule, regardless of whether it's used by some company or not.

comment implies storing a string value. Ideally, I'd like to be able to store structured data in that field.

@xeenon thank you, I've updated the text and changed the field name to metadata.

What about userdata?

As for "metadata", Chrome creates a _metadata folder in extensions installed from the web store and in unpacked ones with static DNR rules, which kind of "suggests that this section is somehow meaningful for the API."

Tbh, I am fine with any name as long as it allows storing some structured data there :)

Even if one can, it would be nice to not add unrecognized metadata to a ruleset file. That needs to be loaded and parsed at some point.

Hi @Rob--W, let me please comment on some points from the meeting minutes

  • [rob] Sounds like the only reason for this request is that the Chrome Web Store review process favors updates with only DNR ruleset changes. CWS could also adjust their process to permit specific comment files.

The CWS review process is indeed the main reason for the proposal. On the other hand, fast track review for DNR-only changes is a sensible thing, and there's a chance that other stores will implement it in a similar fashion. Therefore, having a standard way to store metadata could be useful to the teams that run these stores.

If performance is a big concern, additional restrictions on the metadata format can be imposed, for instance storing it as a single unstructured string or storing it in a separate file. Ultimately, everything is fine as long as every browser vendor supports it.

  1. Most people agreed on the field name comment.
  2. Firefox agreed to change the parsing behavior and ignore unknown fields.
  3. Just in case, we need to confirm that this is the default parsing behavior in Chrome and Safari.

I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1886608 to track this.

Note: @ameshkov confirmed in person that the intended use is to fetch() the JSON file.
There are no DNR APIs to fetch the (parsed) static rules directly. updateDynamicRules/updateSessionRules currently throw when an unrecognized key is passed (in Chrome and Firefox, like any other extension API method). This enables feature detection, and is not changing. @xeenon said that Safari currently does not throw when a rule has unrecognized properties (in general, Safari's extension API methods accept unrecognized properties without throwing).
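
One practical consequence, sketched below in TypeScript: a rule read from the static ruleset file and then reused as a dynamic or session rule would need its extra key removed first, since those methods throw on unrecognized keys in Chrome and Firefox. The helper name stripExtraKeys is hypothetical, and the default key name "metadata" is an assumption taken from the proposal text (the thread also discusses "comment").

```typescript
type JsonObject = Record<string, unknown>;

// Returns a shallow copy of `rule` without the given extra top-level keys.
// The input object is left untouched so the metadata stays available
// for debugging.
function stripExtraKeys(
  rule: JsonObject,
  extraKeys: string[] = ["metadata"] // assumed key name, see the note above
): JsonObject {
  const copy: JsonObject = { ...rule };
  for (const key of extraKeys) {
    delete copy[key];
  }
  return copy;
}

// A rule as it might appear in the static ruleset JSON file.
const staticRule: JsonObject = {
  id: 1,
  priority: 100000,
  action: { type: "block" },
  condition: { urlFilter: "||example.org^", resourceTypes: ["script"] },
  metadata: { source: "||example.org^$script,domain=foo.com,important" },
};

const dynamicRule = stripExtraKeys(staticRule);
console.log(Object.keys(dynamicRule));
```

In an extension, the ruleset text would typically come from fetch(chrome.runtime.getURL(path)) with the extension's own ruleset path, and the stripped rule could then be passed in the addRules list of updateDynamicRules.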

Firefox 128 will now silently ignore unrecognized keys in static rules and accept rules whose recognized keys have valid values. This was implemented in https://bugzilla.mozilla.org/show_bug.cgi?id=1886608

This means that rules such as condition: { domains: ["example.com"] } will be equivalent to condition: {}, because "domains" is not recognized: it was deprecated in favor of initiatorDomains, which Firefox supports.

Firefox 128 will now silently ignore unrecognized keys in static rules and accept rules whose recognized keys have valid values.

IMHO, a global consensus would be beneficial to "silently ignore unrecognized keys" in any settings JSON and/or similar objects, which would eliminate the need to request such policy individually in each area.

@erosman I agree. That has been Safari's policy.

The consensus is to avoid hard errors when an unrecognized manifest key is found: #14

Chrome and Firefox currently print warnings in the developer UI when an unrecognized manifest property is found, to ease development. The intentional decision to trigger neither errors nor warnings for DNR enables the use of unrecognized properties, at the expense of losing the ability to detect misspelled property names.
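
Since the browser no longer reports misspelled keys in static rules, an extension's build step could check for them instead. The TypeScript sketch below is one possible approach under stated assumptions: the key lists are illustrative and partial, so they should be checked against the browser versions being targeted, and "metadata" is assumed as the permitted extra key. The name findUnrecognizedKeys is hypothetical.

```typescript
// Illustrative, partial lists; verify against the targeted browsers.
// "domains" is deliberately absent, matching the Firefox behavior
// described earlier in this thread.
const KNOWN_RULE_KEYS = ["id", "priority", "action", "condition"];
const KNOWN_CONDITION_KEYS = [
  "urlFilter", "regexFilter", "isUrlFilterCaseSensitive",
  "initiatorDomains", "excludedInitiatorDomains",
  "requestDomains", "excludedRequestDomains",
  "resourceTypes", "excludedResourceTypes",
  "requestMethods", "excludedRequestMethods", "domainType",
];

// Returns the keys of a rule (and of its condition) that are neither known
// nor explicitly allowed as extra keys.
function findUnrecognizedKeys(
  rule: Record<string, unknown>,
  allowedExtraKeys: string[] = ["metadata"] // assumed key name
): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(rule)) {
    if (KNOWN_RULE_KEYS.indexOf(key) === -1 && allowedExtraKeys.indexOf(key) === -1) {
      problems.push(key);
    }
  }
  const condition = rule["condition"];
  if (typeof condition === "object" && condition !== null) {
    for (const key of Object.keys(condition)) {
      if (KNOWN_CONDITION_KEYS.indexOf(key) === -1) {
        problems.push(`condition.${key}`);
      }
    }
  }
  return problems;
}

// "prority" is a typo and "domains" is the deprecated condition key.
const suspiciousRule = {
  id: 1,
  prority: 1,
  action: { type: "block" },
  condition: { domains: ["example.com"] },
  metadata: { source: "||example.org^$domain=example.com" },
};

const unrecognized = findUnrecognizedKeys(suspiciousRule);
console.log(unrecognized);
```

Running such a check before packaging restores the typo detection that the silent parsing gives up, without changing what ships in the ruleset file.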