mtgjson/mtgjson3

Remove reminder text from the rules text field

Closed this issue · 7 comments

Reminder text is not rules text, and WOTC is rather inconsistent when it comes to putting it on cards. For example, some equipment cards have reminder text that explains what equip means, while others have none. This is quite frustrating for search engine functionality, since the "text" fields of many cards will contain words that are not actually part of the rules text and don't appear on all similar cards.

I suggest either removing all reminder text from the "text" field altogether, or having separate "all text" and "rules text" fields, one of which would contain reminder text and one would not.

This is somewhat related to #75, though that one is about ability words. (Ability words are not rules text either, but they, on the other hand, are quite useful to have in the "text" field.)

fenhl commented

Reminder text is part of Oracle text and should remain in the JSON since it's information that can't easily be derived from any other data. For use cases where only rules text is relevant, reminder text can always be removed using a regex ( ?\(.+?\)). This could of course be done in MTG JSON itself to create a "rulesText" field, but would that be worth the increase in file size?
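For anyone who wants to try the regex approach on the consumer side, here is a minimal sketch in Python. It assumes that parentheses in the "text" field are used only for reminder text and are never nested; the sample card and the helper name `strip_reminder_text` are made up for illustration and are not part of MTG JSON.

```python
import re

# The pattern quoted above: an optional leading space, then a
# parenthesised span matched non-greedily.
REMINDER_TEXT = re.compile(r" ?\(.+?\)")


def strip_reminder_text(text: str) -> str:
    """Return `text` with every parenthesised span removed.

    Assumption: parentheses only ever wrap reminder text and are not nested.
    """
    stripped = REMINDER_TEXT.sub("", text)
    # Trim leftover whitespace and drop lines that held only reminder text.
    lines = [line.strip() for line in stripped.split("\n")]
    return "\n".join(line for line in lines if line)


# Hypothetical card record, shaped like an entry with "name" and "text" fields.
card = {
    "name": "Example Equipment",
    "text": (
        "Equipped creature gets +1/+1.\n"
        "Equip {1} ({1}: Attach to target creature you control. "
        "Equip only as a sorcery.)"
    ),
}

rules_text = strip_reminder_text(card["text"])
print(rules_text)
```

Because the pattern does not match across line breaks, reminder text that spans several lines would be left in place by this sketch.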

The issue is that reminder text is just noise, and its inclusion or exclusion is based on arbitrary criteria like whether the card was printed in a core set or an "expert level" expansion. It feels inelegant to me to have that sort of extraneous data in the file.

I won't presume to know all the applications for this data and I'm sure there are some where having the reminder text is useful. In general however I'd hazard a guess that it isn't, for the exact reason that it's inconsistently added to cards and therefore can't be relied on.

Your concern about file size is valid, and in fact removing reminder text from the file altogether would reduce the file size even further. Perhaps allCards.json could have the reminder text gone completely, and allCards-x.json have a separate field to include it?

fenhl commented

Another issue if we want to go down that path is, do we want to normalize card text even further? Lists of keyword abilities without reminder text tend to be in a single line, separated by commas, while reminder text forces them to be split into multiple paragraphs.
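To make the normalization question concrete, here is a rough sketch of what merging keyword lines could look like once reminder text is already gone. The keyword set is a small, hypothetical sample rather than a complete list, and the function names are invented for this example; abilities with parameters or costs would need more careful handling.

```python
# Hypothetical, deliberately incomplete keyword set for illustration only.
SIMPLE_KEYWORDS = {"flying", "trample", "first strike", "vigilance", "haste", "lifelink"}


def join_run(run):
    """Join keyword lines with commas, lowercasing all but the first."""
    return ", ".join([run[0]] + [kw.lower() for kw in run[1:]])


def merge_keyword_lines(text: str) -> str:
    """Collapse consecutive single-keyword lines into one comma-separated line."""
    merged = []
    run = []
    for line in text.split("\n"):
        if line.strip().lower() in SIMPLE_KEYWORDS:
            # Still inside a run of bare keyword lines.
            run.append(line.strip())
        else:
            if run:
                merged.append(join_run(run))
                run = []
            merged.append(line)
    if run:
        merged.append(join_run(run))
    return "\n".join(merged)


print(merge_keyword_lines("Flying\nTrample\nWhen this creature dies, draw a card."))
```

Lines that are not bare keywords pass through unchanged, so the sketch only touches the case described above.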

mtgjson was never really a Gatherer clone, because it has always tried to fix errors.
So making card information easier to read and scrapping "unneeded" information doesn't sound too bad.

Also, there is the originalText field!
That would basically be the "with rules text expansion" version somehow. :)

fenhl commented

Reminder text is certainly not “unneeded”. Let's say you're running a bot that shows card data. You'd want to include reminder text since it can be helpful for players reading the card for the first time, for the same reason it's printed on the card in the first place. Also, remember that reminder text receives errata (e.g. the trample reminder text has been tweaked over the years to make it shorter and easier to understand), so the current Oracle reminder text cannot be derived from the "originalText" field.

I use reminder text to automatically derive rule sets using heuristic algorithms. Removing it will significantly impact the usefulness of mtgjson for this application.

That said, so long as there is a set that includes the text, I have no objections to it being removed from other sets.

After reviewing my options, I have decided to leave reminder text in, where applicable. If it's in Gatherer/SF, it will remain in MTGJSON4.