ericcornelissen/webmangler

HTML attribute with different casing not mangled consistently

Bug Report

Description

HTML attributes are currently not mangled correctly if the same attribute is used with different casings. Since HTML attribute names are case-insensitive, an attribute with non-standard casing should be mangled identically to the same attribute with standard casing.

Steps to Reproduce

  1. Have a file that contains HTML attributes, e.g. an HTML file with this content:
    <style>
    [Data-foo] { color: red; }
    .example::before { content: attr(data-Foo); }
    </style>
    <div data-foo>Hello world</div>
    <div data-Foo>Bonjour le monde</div>
    <div data-FOO>Hallo wereld</div>
  2. Mangle the file from step 1 with the appropriate built-in language plugin (the version available as of 12ef971) as well as the HTML Attribute Mangler. Configure the attrNamePattern so that it also matches uppercase letters, e.g. "[A-Za-z-]+".
  3. Observe that the mangled file has each casing of the data-foo attribute mangled differently. For the example HTML above, the output could look like:
    <style>
    [data-a] { color: red; }
    .example::before { content: attr(data-c); }
    </style>
    <div data-b>Hello world</div>
    <div data-c>Bonjour le monde</div>
    <div data-d>Hallo wereld</div>

Related

The question of case sensitivity should be addressed at the level of language plugins. The reasoning for this is as follows:

  1. Mangler plugins should ideally not control case sensitivity, as it can differ from language to language (see footnote 1).
  2. The language plugins do not currently "control" the behavior of the core. We want to keep it that way, hence the core cannot address case sensitivity.
  3. Language plugins can handle case sensitivity by interacting with the core using a normalized casing. For example, .findAll returns all strings in lowercase, and .replaceAll then makes sure strings get replaced regardless of their casing in the file.
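
To illustrate point 3, here is a minimal sketch of a language plugin that normalizes casing. It is not the actual WebMangler implementation: the standalone `findAll` and `replaceAll` functions, their signatures, and the hard-coded `data-` pattern are all assumptions made for the example.

```typescript
// Hypothetical sketch: these functions and signatures are assumptions for
// illustration, not the real WebMangler language plugin API.

// Assumed pattern for "data-" attribute names; the "i" flag makes the match
// case-insensitive, the "g" flag finds every occurrence.
const ATTRIBUTE_EXPR = /data-[a-z-]+/gi;

// Report every attribute name in normalized (lowercase) form, so the core
// only ever sees a single spelling per attribute.
function findAll(source: string): string[] {
  const found = source.match(ATTRIBUTE_EXPR) ?? [];
  return found.map((name) => name.toLowerCase());
}

// Replace attribute names regardless of how they are cased in the file.
// The mapping is keyed by the normalized name, so each match is lowercased
// before the lookup. Names without a mapping are left untouched.
function replaceAll(source: string, mapping: Map<string, string>): string {
  return source.replace(
    ATTRIBUTE_EXPR,
    (match) => mapping.get(match.toLowerCase()) ?? match,
  );
}

// Usage: both casings resolve to the same mangled name.
const input = "<div data-Foo></div><div data-FOO></div>";
const mangled = replaceAll(input, new Map([["data-foo", "data-a"]]));
console.log(mangled); // prints: <div data-a></div><div data-a></div>
```

Because the core only receives and returns lowercase names in this sketch, it needs no knowledge of case sensitivity. A plugin for a case-sensitive language would simply skip the normalization.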

The drawbacks of this approach are:

  1. Case sensitivity must be considered in each language plugin. Unfortunately, this is a requirement as per point 1 of the reasoning above.
  2. Case normalization must be standardized across language plugins.

Footnotes

  1. Take for example JSX, where most attributes are case sensitive (source).