loot/libloot

Refactoring the util: statements to avoid copy-pasting

Infernio opened this issue · 1 comments

The Idea

As it stands, util: statements take the entire tool name:

- name: 'Something.esp'
  clean:
    - crc: 0x832B2C99 # v1.0
      util: 'SSEEdit 3.2.2'

This leads to copious amounts of copy-pasting, more than 500 times in the skyrim masterlists alone.
While the information given by the version itself is useful (3.2.2 vs 3.2.84, for example), the prelude (SSEEdit / TES5Edit) is already implied by the masterlist we're in (the skyrimse masterlist clearly implies SSEEdit, etc.).

The idea is to place a definition at the beginning of each masterlist that specifies the xEdit tool to be used:

- &xEdit
  name: SSEEdit
  display: '[SSEEdit](https://www.nexusmods.com/skyrimspecialedition/mods/164/)'

(the definition given above is just a mockup, the final one could obviously look different)

That way, the tools do not have to be hardcoded and future masterlists can specify their xEdit tool without any code changes.

The *dirtyPlugin, *quickClean and *noAutomaticFix messages would then be edited to accept just a version instead of the entire tool name in the util: field:

- name: 'Something.esp'
  clean:
    - crc: 0x832B2C99 # v1.0
      util: '3.2.2'

The tool name would be read from the *xEdit definition by LOOT and automatically prepended in front of the version given by util:.

Upsides & Downsides (with some rebuttals from the discussion below them)

Upsides

  • Avoids copypasting in the future, as only one definition of the xEdit tool will have to exist now
  • By extension, avoids typos in the future
  • Allows the removal of more than 500 occurrences of the words SSEEdit, TES5Edit, etc.
    • That might not be a big enough removal to justify editing all master lists

Downsides

  • Would have to be implemented within LOOT's code
  • Each master list would have to be edited to add the &xEdit definition and to remove the tool from each occurrence of util:
    • The latter can be accomplished through a search and replace, as each util: statement starts with the same prelude
  • xEdit would have to be updated to generate definitions with this new style
    • Elminster has already noted that this is no problem for him, as he has to change that part of xEdit regardless to add support for *quickClean

After thinking about this some more, I'm going to close this as wontfix, here's my opinions on the upsides and downsides raised.

Upsides

Avoids copypasting

What's wrong with copy/pasting? The repetition makes it easy to not notice a mistake, but that comes under the next upside. I've not found it to be a bottleneck or paper cut when adding dirty info, but I do so rarely, so maybe I'm missing something there.

Avoids typos

I can understand this, but is it a problem that's actually been encountered?

Removing duplicate content

I don't think there's a "big enough" threshold for editing all the masterlists, it's not something that needs justifying. However, I did consider what effect all the repetition of SSEdit, etc. has on the sizes of the masterlists. Looking at the Skyrim and Skyrim SE masterlists because they're the most developed:

  • Skyrim has 281 matches for the regex util: '?TES5Edit . Removing the TES5Edit part reduces the masterlist size from 973,355 bytes to 970,826 bytes (-2529 bytes, 0.3%). When gzipped (as they are when downloaded), the size goes from 181,678 bytes to 181,594 bytes (-84 bytes, 0.05%).
  • Skyrim SE has 401 matches for the regex util: '?SSEEdit . Removing the SSEEdit part reduces the masterlist size from 310,118 bytes to 306,894 bytes (-3224 bytes, 1%). When gzipped (as they are when downloaded), the size goes from 44,692 bytes to 44,616 bytes (-76 bytes, 0.2%).

I think those numbers show that the direct benefits of removing utility names are insignificant.

Downsides

Implementation in code

This would be the only case where LOOT performs implicit string concatenation, and it does so non-locally too (as in &xEdit would be defined very far from where it's used).

  • The non-locality makes it a pain to handle. Parsing metadata is roughly a two step process: the first converts a YAML string into a syntax tree of data structures, and the second converts those into LOOT's semantic metadata structures. The second conversion happens on a node-by-node basis, so everything needs to be self-contained, e.g. when converting a syntax subtree to a PluginCleaningData object, you only get the root of the subtree that is expected to be a PluginCleaningData object, so everything that should be part of the resulting object needs to in that subtree. YAML's anchors/aliases work around that by being a YAML feature, and so are put where they're needed when creating the syntax tree. That wouldn't work for the utility name, so I'd have to store some mutable global state externally and reference that instead, which is messy and horrible.
  • I don't like that there's no way to tell that util: '3.2.2' is going to become SSEEdit 3.2.2 (or whatever the value is set to), it's too much "magic" and could cause issues with potential future changes like refactoring metadata, changing the masterlist format to something other than YAML, etc. Migrating metadata from BOSS was often impossible because its masterlist format relied heavily on implicit relationships that weren't represented at a data structure level, and I'd hate to dig that hole again with something like this. I can't think of a problem case, it's more of a gut feeling based on past experience.

The first point about implementation difficulty is enough on its own to scupper this idea for me. The other downsides are more woolly, but there's no good way I can see to actually do what's being asked.

Editing the masterlists

I'm not bothered by this, as mentioned a regex search/replace is easy enough.

xEdit would need an update

I don't like that this would mean tying xEdit to new versions of LOOT, as it means xEdit would produce output that's not backwards-compatible with older versions of LOOT, and trying to use it there would not fail safely: users wouldn't see that something went wrong, they'd just see nonsense warning messages like "v3.2.2 found 12 deleted records", and that's not a great experience.

This is a more general issue than just this, invoked by any metadata change, but it's something to be wary of. Backwards-incompatible metadata syntax changes are painful, and incompatibilities that aren't fail-safe are worse.

Lack of flexibility

This isn't mentioned above, but I consider it a downside. I don't like that this proposal means there can only be one dirtiness checking utility per game. This isn't really a practical concern right now, because there's nothing else like xEdit out there, but I like LOOT to be flexible: for example, I like that LOOT's Bash Tag handling doesn't favour Wrye Bash over Mator Smash. What if a new utility pops up that is more effective at checking for dirty edits? We can't just change the &util definition, because that would invalidate all the existing dirty info, and we can't override it for any new entries that are checked using the new utility - we'd effectively have hardcoded ourselves into having to stick with xEdit.