taw/magic-sealed-data

Set and Collector boosters / booster fun

Closed this issue · 31 comments

Would there be any problems with adding "boosters" to the PLIST set and using those to store which cards are available in any given set booster? We could even store information about relative rarities for Universes Within cards or rare/common List cards.

Example:

{
  "data": {
    "booster": {
      "ZNR": {
        "boosters": [
          {"contents": {"znr_list_common": 1}, "weight": 3},
          {"contents": {"znr_list_rare": 1}, "weight": 1}
        ],
        "boostersTotalWeight": 4,
        "sheets": {
          "znr_list_common": {"cards": {"plist:161": 1, ...}},
          "znr_list_rare": {"cards": {"plist:241": 1, ...}}
        }
      }
    }
  }
}

Information from https://mtg.fandom.com/wiki/The_List/ZNR-SNC and https://www.lethe.xyz/mtg/collation/znr.html
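
To make the proposed structure concrete, here is a minimal Python sketch of how a consumer might open a pack from it. The dictionary mirrors the example above; `plist:162` is a placeholder id added only so the common sheet has more than one card, and the `open_pack` helper is hypothetical, not part of any existing tool.

```python
import random

# Data in the shape proposed above. "plist:162" is a placeholder id.
booster = {
    "boosters": [
        {"contents": {"znr_list_common": 1}, "weight": 3},
        {"contents": {"znr_list_rare": 1}, "weight": 1},
    ],
    "boostersTotalWeight": 4,
    "sheets": {
        "znr_list_common": {"cards": {"plist:161": 1, "plist:162": 1}},
        "znr_list_rare": {"cards": {"plist:241": 1}},
    },
}


def open_pack(booster, rng=random):
    """Pick one pack configuration by weight, then draw cards from its sheets."""
    configs = booster["boosters"]
    config = rng.choices(configs, weights=[c["weight"] for c in configs], k=1)[0]
    pack = []
    for sheet_name, count in config["contents"].items():
        cards = booster["sheets"][sheet_name]["cards"]
        names = list(cards)
        weights = list(cards.values())
        # Draw without replacement so a sheet never yields the same card twice.
        for _ in range(count):
            pick = rng.choices(names, weights=weights, k=1)[0]
            idx = names.index(pick)
            names.pop(idx)
            weights.pop(idx)
            pack.append(pick)
    return pack


print(open_pack(booster, random.Random(0)))
```

With weights 3 and 1, the common List sheet would be chosen in roughly three packs out of four.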

taw commented

From what I understand, PLIST content is not in draft boosters, so it wouldn't go into any existing ones.

The main reason none of this is done is the amount of effort required, as well as the fairly limited information we have.

I can continue doing draft boosters (and Arena boosters, since they're pretty much the same), but after a few tries I gave up on "booster fun" showcase art.

I'm definitely not going to do set boosters, collector boosters etc. unless someone steps in, and is willing to do the work.

If you're interested, right now the system is just code, mostly in these two files:

And also some others like:

If people are committed to putting in some work, I guess I could set up a separate project with all booster data, the way I did it for precons.

Data in this repo is completely machine-generated, so it would need to be fixed upstream.

I'm already doing the work for collector/set boosters for a different project so I'll see if it's an easy transfer.

taw commented

Assuming you probably don't want to edit a lot of code files, one way this kind of collaboration could work would be a repo with a structure like data/m10/draft.yml:

And inside something like:

---
packs:
- rate: 9
  contents:
    foil: 1
    common: 9
    uncommon: 3
    rare_mythic: 1
- rate: 31
  contents:
    common: 10
    uncommon: 3
    rare_mythic: 1
sheets:
  common:
    query: e:m10 r:common size<=base
    size: 121
  uncommon:
    query: e:m10 r:uncommon size<=base
    size: 60
  rare_mythic:
    - query: e:m10 r:rare size<=base
      size: 53
      rate: 2
    - query: e:m10 r:mythic size<=base
      size: 15
      rate: 1

Then some indexer script would resolve all queries and turn them into machine-readable JSON files like the ones in this repo.
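
As a rough illustration of that indexer step, the Python sketch below converts a parsed pack description into the `boosters`/`sheets` JSON shape. Everything here is an assumption for illustration: `build_sheet`, `build_booster`, and the `resolve` callback are hypothetical names, the query results are a toy lookup table rather than a real search engine, and treating a sub-sheet's `rate` as the per-card weight is one possible reading of the YAML.

```python
import json


def build_sheet(spec, resolve):
    """Turn one sheet spec (a dict, or a list of dicts) into {card_id: weight}."""
    parts = spec if isinstance(spec, list) else [spec]
    cards = {}
    for part in parts:
        found = resolve(part["query"])
        # The declared size acts as a sanity check on the query result.
        if len(found) != part["size"]:
            raise ValueError(
                f"{part['query']!r}: expected {part['size']} cards, got {len(found)}"
            )
        for card_id in found:
            cards[card_id] = part.get("rate", 1)
    return {"cards": cards}


def build_booster(spec, resolve):
    """Convert the parsed YAML structure into the machine-readable form."""
    packs = [{"contents": p["contents"], "weight": p["rate"]} for p in spec["packs"]]
    return {
        "boosters": packs,
        "boostersTotalWeight": sum(p["weight"] for p in packs),
        "sheets": {name: build_sheet(s, resolve) for name, s in spec["sheets"].items()},
    }


# Toy stand-in for the search engine: maps a query string to card ids.
FAKE_RESULTS = {
    "e:m10 r:rare size<=base": ["m10:1", "m10:2"],
    "e:m10 r:mythic size<=base": ["m10:3"],
}
spec = {
    "packs": [{"rate": 1, "contents": {"rare_mythic": 1}}],
    "sheets": {
        "rare_mythic": [
            {"query": "e:m10 r:rare size<=base", "size": 2, "rate": 2},
            {"query": "e:m10 r:mythic size<=base", "size": 1, "rate": 1},
        ]
    },
}
result = build_booster(spec, FAKE_RESULTS.__getitem__)
print(json.dumps(result, indent=2))
```

Passing the resolver in as a callback keeps the conversion logic separate from whatever actually runs the queries.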

Unfortunately, designing an expressive enough system is not straightforward. Even regular foils have convoluted rules, and doing something like BBD this way, ugh.

I'll see what I can do. I'll try with ONE first to get the process down, and then work from there.

Should I close this issue here and reopen it in the magic-search-engine repo?

taw commented

It's fine to keep the issue here now.

We'll probably need to iterate the experiment a few times before we figure out something that works.

Here's my first go at a file for the ONE set boosters.
ONE_SET.txt

The things that need development:

I need to come up with a way to effectively search for "the normal printing of a card that has a showcase treatment" and "the normal printing of a card that does not have a showcase treatment". I handled this in my other code by downloading all of the treatments of a card into one object, and treating it differently depending on how many versions it had. In the file above I just included "has:showcase" and "-has:showcase" in the query.

Getting specific cards from the List. I can manually grab the spreadsheet from https://mtg.fandom.com/wiki/The_List/ and save it as a .csv file, but that might be more manual than you were hoping.

There are also two ways to handle the wildcard slots. I built them as one big sheet, but that's not really how they work. It would be more accurate to use three separate sheets for the common/uncommon/rare_mythic wildcards, and have lots of possible pack configurations to match. Let me know what you think the best solution there is.

taw commented

This looks really complicated, and it's only half done.

If we go forward with it, I'd probably need to code a bunch of simple sheets first before we even get to this kind of complexity.

A few techniques I'm already using:

Some actionable things:

  • Whatever happens, I should probably implement is:showcase. It needs showcase card number ranges to be specified manually for each set.
  • And then has:showcase, as it can be automatically calculated from is:showcase (a card matches has:showcase if there's a card with the same name in the same set that matches is:showcase).
  • Even if it's not actually going anywhere yet, maybe we should create machine-readable files like data/print_sheets/plist/one.txt with a list of all PLIST cards in ONE (and, if more complicated, sheets and multiples like C1 U2 etc.). Do you want to do this?
  • There are already in:booster queries, but maybe I could add queries like booster:one or booster:one-set. This would mostly be useful for cards in one set that appear in another set's boosters, and not directly helpful for this.
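
The has:showcase rule described in the list can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the search engine's code: the `Printing` record, the helper names, and the example cards are all hypothetical.

```python
from collections import namedtuple

# Hypothetical minimal printing record; real card data has many more fields.
Printing = namedtuple("Printing", "name set_code number is_showcase")


def showcase_index(printings):
    """Collect every (set, name) pair that has at least one showcase printing."""
    return {(p.set_code, p.name) for p in printings if p.is_showcase}


def has_showcase(printing, index):
    """True if some printing with the same name in the same set is showcase."""
    return (printing.set_code, printing.name) in index


printings = [
    Printing("Example Card A", "one", "1", False),
    Printing("Example Card A", "one", "301", True),
    Printing("Example Card B", "one", "2", False),
]
index = showcase_index(printings)

# "The normal printing of a card that has a showcase treatment":
normal_with_showcase = [
    p.number for p in printings if has_showcase(p, index) and not p.is_showcase
]
print(normal_with_showcase)
```

Note that under this rule a showcase printing also matches has:showcase itself, so selecting only the normal printing needs an extra condition such as the `not p.is_showcase` check here.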

It sort of looks like a follow-up for booster:one could be something like sheet:plist/one-c, but this doesn't quite work as queries don't have a concept of multiples.

Do you want to give this 4-step plan a go (3 for me, and 1 for you), and then we can get back to thinking about next steps?

Most of the complexity is identifying which cards are and are not showcase, so having is:showcase and has:showcase would be a huge boost. Maybe I could add tagged ranges to the top of the file? That would let us reuse the same functions for showcase, borderless, extended-art, etched, etc. Let me know if you want any help implementing this.

If it's any use, both Scryfall and MTG.WTF accept frame:showcase as a valid search parameter.

I can start building a plist machine readable file. That shouldn't be too difficult from what I already have so I can get that built pretty quickly.

I don't know that booster:one is super useful. Most of the other set cards are from the commander set and jumpstart rares, which don't show up in draft boosters. But the showcase mythic praetors that show up in draft boosters are a one-time thing, which we probably don't need to develop a special function for.

taw commented

Actually we already have is:showcase based on mtgjson data, so there's no extra work here.

taw commented

csv for The List: PLIST_SLX_DISTR.csv

Now that you have is:showcase and has:showcase, you can test it on the normal draft boosters to see if it works. Here's the code for the ONE common and uncommon sheets:

  common:
    - query: e:one r:common is:showcase
      size: 6
      rate: 1
    - query: e:one r:common size<=base has:showcase
      size: 6
      rate: 2
    - query: e:one r:common size<=base -has:showcase
      size: 95
      rate: 3
  uncommon:
    - query: e:one r:uncommon is:showcase
      size: 7
      rate: 1
    - query: e:one r:uncommon size<=base has:showcase
      size: 7
      rate: 2
    - query: e:one r:uncommon size<=base -has:showcase
      size: 53
      rate: 3

I was able to get the queries working by adding -is:foilonly to the showcase searches. The step-and-compleat foils and oil slick foils have different card numbers, which was throwing off the uncommon searches. Removing the "foil only" versions cleans up most of the problems.

  common:
    - query: e:one r:common is:showcase -is:foilonly ++
      size: 6
      rate: 1
    - query: e:one r:common size<=base has:showcase
      size: 6
      rate: 2
    - query: e:one r:common size<=base -has:showcase
      size: 95
      rate: 3
  uncommon:
    - query: e:one r:uncommon is:showcase -is:foilonly ++
      size: 7
      rate: 1
    - query: e:one r:uncommon size<=base has:showcase
      size: 7
      rate: 2
    - query: e:one r:uncommon size<=base -has:showcase
      size: 53
      rate: 3

I'm finding that potentially a "boosterfun" tag is possible using this setup:

is:boosterfun == (-is:foilonly (is:showcase or is:borderless))
e:set has:boosterfun == e:set alt:(e:set -is:foilonly (is:showcase or is:borderless))

It looks like this is reliably giving me only the versions of cards that can appear as booster fun treatments. I checked for each set back to ZNR and the only exception is that for DMU/KHM/NEO/SNC it's giving me the new prototype praetors alongside the other treatments. Any idea how we could account for them?

Added is:boosterfun and has:boosterfun to the magic-search-engine repo: taw/magic-search-engine#163

taw commented

I've been trying to come up with a complete YAML DSL for this.

Here are some examples.

I don't even know how this would be implemented, I'm just trying to come up with something that's human readable, unambiguous for machine, not too repetitive, and flexible enough to deal with difficult cases as well.

Some notes:

  • set_code.yaml or set_code-subtype.yaml determines the set code, so there's no need to say it; it's draft by default
  • sheets are all-foil or all-nonfoil; that's how data export for mtgjson works, and it's true for all traditional boosters (set boosters will need to work around it; if it's too hard, then I guess code in the indexer would need to work around it)
  • query is scoped to e:<set> is:nonfoil is:booster by default, or e:<set> is:foil is:booster if foil: true; use rawquery when that default scope does not apply
  • code: "X" or code: "set/X" if there's an explicit list somewhere, with or without multiples
  • I already don't like the complexity of m10's foil
  • queries should generally have a count for validation, at least the more complicated ones; not included yet
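
A small Python sketch of the default scoping rule from these notes follows. The sheet keys `query`, `foil`, and `rawquery` come from the notes, but the exact semantics are an assumption here (in particular, that `rawquery` is passed through untouched), and `scoped_query` is a hypothetical helper name.

```python
def scoped_query(set_code, sheet):
    """Expand a sheet's query with the default scope described in the notes.

    Assumption: a sheet has either "query" (scoped automatically) or
    "rawquery" (used exactly as written).
    """
    if "rawquery" in sheet:
        return sheet["rawquery"]
    finish = "is:foil" if sheet.get("foil") else "is:nonfoil"
    return f"e:{set_code} {finish} is:booster {sheet['query']}"


print(scoped_query("m10", {"query": "r:common"}))
print(scoped_query("m10", {"query": "r:common", "foil": True}))
print(scoped_query("m10", {"rawquery": "e:plist r:rare"}))
```

Keeping the scope implicit is what lets most sheet definitions shrink to a short rarity filter.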

This format is just the first idea. It's not terrible, but possibly we could do better. I'm not particularly attached to yaml, but it probably beats json for this, as deeply nested JSON is annoying for humans to edit, and I'm not sure what would be other commonly known format.

I think yaml will work fine, especially if we're diligent about using comments where things get complicated.

Is there a way to change the set scope? Even for bonus sheets you need to include a separate set code.

Set boosters and collector boosters should still follow the rules for each slot being only foil or non-foil, but there may be some exceptions. Worst case we can use two sheets and include the variation in the pack configuration.

Should I redo the The List CSV file in a different format for your set/code notation? Do you have a sample of that format or a link to an existing code sheet I can base it on?

taw commented

I started implementing this. This is the main issue taw/magic-search-engine#164

taw commented

So looking at PLIST_SLX_DISTR.csv, and how it would convert into something the system can read:

  • MH2 wouldn't be a valid sheet identifier, as identifiers follow the old convention that U2 means "on sheet U, 2 times"; the rest are fine. This is a bit unfortunate, but something like "MHTWO" can work for now as a sheet code.
  • There's no way to mix PLIST and SLX on a print sheet with a single code: X; they'd need to be assigned separately and then mixed in set-collector.yaml or such.
  • I haven't tested DFCs with the code; they'll probably work.

The format would be something like (space alignment optional and just for reading convenience):

Iterative Analysis           1         ZNR KHM STX
...
Brainstorm                  47         ZNR
...

taw commented

I made card numbers optional if they're unambiguous.

From SIS sheet:

Vampiric Fury                         A
Bump in the Night                     B
Cackling Counterpart                  B
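
For readers who want to consume this print-sheet format, here is a hedged Python sketch of a line parser. It assumes that sheet codes are runs of uppercase letters (which matches ZNR, A, B, and MHTWO), that card numbers are digits with an optional lowercase letter suffix, and that no card name ends in an all-uppercase word. The actual parser in magic-search-engine may work differently; `parse_line` is a hypothetical name.

```python
import re

NUMBER = re.compile(r"^\d+[a-z]?$")  # e.g. "47", or "12a" for a DFC front face
CODE = re.compile(r"^[A-Z]+$")       # e.g. "ZNR", "A", "MHTWO"


def parse_line(line):
    """Split a line into card name, optional card number, and sheet codes."""
    tokens = line.split()
    codes = []
    # Sheet codes sit at the end of the line; always keep one token for the name.
    while len(tokens) > 1 and CODE.match(tokens[-1]):
        codes.append(tokens.pop())
    codes.reverse()
    number = None
    if len(tokens) > 1 and NUMBER.match(tokens[-1]):
        number = tokens.pop()
    return {"name": " ".join(tokens), "number": number, "codes": codes}


print(parse_line("Iterative Analysis           1         ZNR KHM STX"))
print(parse_line("Bump in the Night                     B"))
```

Parsing from the right means the column alignment really is optional: single spaces between fields work as well as padded columns.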

Updated with new formatting and loaded to pull request here: taw/magic-search-engine#169

I included the numbering anyway, because there are at least 2 instances of duplicate cards (Lightning Bolt is both 142 and 429, and Rout is both 823 and 970) and this future-proofs against more duplicates being added.

Double-faced cards are working perfectly but need the "a" after the number to ensure the right face is chosen.

Set boosters to do:

  • ZNR
  • KHM
  • STX
  • STX-jp
  • MH2
  • AFR
  • MID
  • VOW
  • NEO
  • SNC
  • CLB
  • DMU
  • BRO
  • ONE
  • MOM

Collector boosters to do:

  • ELD
  • THB
  • IKO
  • IKO-jp
  • M21
  • ZNR
  • 2XM (vip pack)
  • CMR
  • KHM
  • STX
  • MH2
  • AFR
  • MID
  • VOW
  • NEO
  • SNC
  • CLB
  • 2X2
  • DMU
  • UNF
  • BRO
  • ONE
  • DMR
  • MOM

Draft boosters to do (check boosterfun):

  • WAR-jp
  • M20
  • ELD
  • THB
  • IKO
  • IKO-jp
  • M21
  • ZNR
  • CMR
  • 2XM
  • KHM
  • TSR
  • STX
  • STX-jp
  • MH2
  • AFR
  • MID
  • VOW
  • NEO
  • SNC
  • CLB
  • 2X2
  • DMU
  • UNF
  • BRO
  • 30A
  • ONE
  • DMR
  • MOM

@taw As I build these, do you want me to make pull requests for each set or try to batch them into bigger commits?

taw commented

As long as PRs are YAML only and they build correctly (you can verify that by running rails locally or with the open_pack command), just do them whichever way you want and I can merge them right away. If you want to fix the YAML files later, that's also no problem.

The only things I'd really review more closely would be code. It's also probably best to wait a few days for MOM, there's a lot of bad data there right now.

Yeah I'm starting with the oldest sets and working forward to give MOM data some time to settle.

Some cursory research on foil rates for set boosters:

https://www.reddit.com/r/magicTCG/comments/lge567/kaldheim_2_set_booster_stats_and_reflections/
https://www.reddit.com/r/mtgfinance/comments/ljbbb6/what_i_got_in_a_kaldheim_set_booster_box/
https://www.reddit.com/r/magicTCG/comments/ip70yj/set_booster_vs_draft_booster_expected_contents/

Out of 4 boxes (120 packs) there were 14 foil rares/mythics. Two boxes didn't report foil commons/uncommons, but the other two boxes had 42 foil commons, and 12 foil uncommons. For the standard 12/5/3 distribution we would expect to see 18 foil rares, 15 foil uncommons, and 36 foil commons for the same number of packs. For the dedicated 10/3/1 distribution, we would expect 8.5 foil rares, 13 foil uncommons, and 42.5 foil commons. The chi-square of the 12/5/3 distribution is 2.889 with a p-value of 0.2359. The chi-square of the 10/3/1 distribution is 5.300 with a p-value of 0.0707. Based on this, I'm going to continue using the 12/5/3 distribution for set boosters until I find more data to indicate one way or the other.
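
The arithmetic behind these figures can be partly reproduced with a short Python sketch. It assumes 30 packs per box and one foil per pack, so commons and uncommons are scaled to the 60 packs that reported them and rares to all 120. The chi-square statistics are taken as quoted rather than recomputed, since the exact pooling of the box reports is not spelled out; only the step from statistic to p-value is shown, using the closed form for 2 degrees of freedom. The helper names are hypothetical.

```python
import math


def expected(packs, ratio):
    """Expected foil counts per rarity for a common/uncommon/rare ratio."""
    total = sum(ratio)
    return [packs * r / total for r in ratio]


def p_value_df2(chi_square):
    """Upper-tail p-value of a chi-square statistic with 2 degrees of freedom."""
    return math.exp(-chi_square / 2)


# Commons and uncommons from 60 packs, rares from 120 packs (assumed box size 30).
c_12_5_3, u_12_5_3, _ = expected(60, (12, 5, 3))
r_12_5_3 = expected(120, (12, 5, 3))[2]
c_10_3_1, u_10_3_1, _ = expected(60, (10, 3, 1))
r_10_3_1 = expected(120, (10, 3, 1))[2]

print(c_12_5_3, u_12_5_3, r_12_5_3)
print(round(c_10_3_1, 1), round(u_10_3_1, 1), round(r_10_3_1, 1))
print(round(p_value_df2(2.889), 4), round(p_value_df2(5.300), 4))
```

Under these assumptions the 12/5/3 expectations come out at 36, 15, and 18, and the 10/3/1 expectations at roughly 42.9, 12.9, and 8.6, close to the rounded values quoted above.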

Known things to fix:

  • MH2 mythics with multiple showcase treatments should have rarity adjusted accordingly
  • Bladed Ambassador and Elesh Norn rarity wrong for one-collector/compleat_foil
  • Set booster wildcard slot doesn't have showcase common/uncommon cards
  • War of the Spark Japanese booster foil sheets have anime planeswalkers at 2x rate
  • IKO collector has the Arena-only version of Void Beckoner
  • Add The List to clb-set

taw commented

I recommend checking this out for extra validation.

There isn't any guarantee that card proportions in Arena pack and nonfoil draft pack will be identical, but this is a good baseline, and differences are suggestive of some mistakes on our side.

It only checks draft and Arena boosters, so if there's a mistake in a draft booster that also propagates to set/collector boosters, it would need to be checked manually.

In particular thb.yaml and iko.yaml look wrong to me: R:M rates are 1.62%:0.95%, which really isn't what they typically are.

With taw/magic-search-engine#214 that should be all of the boosters done! Now to troubleshoot and do cleanup.

With taw/magic-search-engine#215 all of the known errors should be resolved. I'm going to close this issue, and new issues can be opened for specific errors or adjustments as needed.