dtcenter/MET

Enhance MET to calculate weighted contingency table counts and statistics

Closed this issue · 7 comments

Describe the New Feature

Per Discussions #2316, this issue is to create an ability within MET to read ice area data and calculate the contingency table statistics for the symmetric difference area of the two fields. These counts need to be area-weighted (not a count of grid points or grid boxes) of each matched pair.

Users need to be able to set thresholds independently (for forecast and observation datasets) that allows control of the critical level of ice for a given area. Levels of 15/40/80% are of interest.

This issue is dependent on several tripolar reading issues, which will need to be completed before the datasets containing Integrated Ice Extent can be read/computed. Additionally, observation datasets that were discussed may already be in polar stereographic projection. Northern and Southern Hemispheres will be processed independently.

Acceptance Testing

@rgrumbine can provide the appropriate datasets for testing this functionality.

Time Estimate

2 weeks?

Sub-Issues

Consider breaking the new feature down into sub-issues.
Using a single issue but breaking this work down into multiple pull requests:

  • #2967 switches from integer contingency table counts to double precision sums of weights.
    • Update CTS/MCTS/Probabilistic statistics to be computed using doubles instead of integers, which revealed a couple of bugs related to integer division and integer overflow for large contingency table counts:
  • Update Grid-Stat to apply the existing grid_weight_flag option when processing contingency tables (CTC, MCTC, PCT, and derived stats).
  • Update Ensemble-Stat to apply the existing grid_weight_flag option when processing contingency tables (CTC, MCTC, PCT, and derived stats).
  • Decided not to add a new WEIGHT_SUM column to the end of the FHO line type and update Stat-Analysis to use it when parsing FHO data into contingency tables. Instead, updated Grid-Stat to print a warning if FHO output is requested with grid weighting and automatically disable the FHO output.
  • Update the existing unit_grid_weight.xml unit tests to exercise this new functionality.
  • Consider whether the grid weights should be normalized (by dividing by the mean of the weight for all points in the current verification task) prior to being used or whether the raw weights should be used.
    • This comment from @j-opatz confirms that raw weights should be used. Support for normalized area weights could easily be added in the future if needed.
  • Keep on the lookout for integer variable types being used to store contingency table entries throughout the code base. Those should all be changed to doubles. Also carefully inspect the usage of n_pairs() (integer number of matched pairs in the contingency table) versus total() (sum of the weights for those pairs). Generally total() should be used to compute statistics, not n_pairs(). However, n_pairs() IS used to weight the aggregation of results across multiple input lines in Stat-Analysis. Since there's not a great reason to change that, recommend keeping that logic in place.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Define the source of funding and account keys here or state NONE.

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Review default alert labels
  • Select component(s)
  • Select priority
  • Select requestor(s)

Milestone and Projects

  • Select Milestone as the next official version or Backlog of Development Ideas
  • For the next official version, select the MET-X.Y.Z Development project

Define Related Issue(s)

Consider the impact to the other METplus components.

New Feature Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Development issue
    Select: Milestone as the next official version
    Select: MET-X.Y.Z Development project for development toward the next official release
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.

On 7/25/24, a meeting was held to discuss specifics of this issue and gather any unknown requirements.

The results of that meeting are collected in the meeting notes.

To accurately reflect what was discussed in the meeting, the contents of this issue will also be updated.

@JohnHalleyGotway and @j-opatz discussed implementation details via Slack on 10/3/24. Here's a summary of why we decided to use the existing grid_weight_flag option rather than adding a new to provide finer configuration control.

From @JohnHalleyGotway:
I’m looking for some advice on implementation details of MET #2887. Grid-Stat has an existing grid_weight_flag config option (see description). I need a way to enable/disable weights being used for contingency table counts and stats.

The simplest choice is using the existing config option. If enabled, weights are applied to continuous stats (existing) and also contingency tables (new). However, that would NOT allow for weighted continuous stats and un-weighted categorical stats… which is what we currently get. In that way it’s not entirely backward compatible. But it sure is simple.

A less simple choice would be adding a new config option specifically for contingency table counts and stats. But IMHO that’s more confusing… but it would enable the changes to be more backward compatible.

From @j-opatz:
If it's advantageous of us to use the existing config options, I'd wager to say that most users who want weighted statistics, be they continuous or categorical, will want them all to be weighted. Even if they wanted one weighted and not the other, that's why we allow instance names in METplus wrappers; users could have one instance where all CTS and CNT output is weighted (via the grid_weight_flag COS_LAT or AREA settings) and another instance where none are. It's more output/run time, but it would allow us to use the existing logic.

Running the full set of unit tests after updating Grid-Stat to apply weights to contingency tables produces many diffs. Here's some updated output, comparing the CTC line to the SL1L2 line for the same data:

V12.0.0 GFS   NA   240000    20120410_000000 20120410_000000 000000   20120410_000000 20120410_000000 WIND      m/s        P850     WIND      m/s       P850    GFSANL FULL          NEAREST     1           >OCDP90     >OCDP90          NA         NA    CTC       18333.60223 1523.51141   643.90538     692.14901 15474.03643      0.5
V12.0.0 GFS   NA   240000    20120410_000000 20120410_000000 000000   20120410_000000 20120410_000000 WIND      m/s        P850     WIND      m/s       P850    GFSANL FULL_BIN_MEAN NEAREST     1           NA          NA               NA         NA    SL1L2     29030          8.16125     8.23317       0         100.47851    102.08706   1.30328

Note that the values in the TOTAL column differ: 29030 != 18333.60223
The SL1L2 TOTAL column reports the number of matched pairs used, while the CTC TOTAL column reports the sum of the weights of the 2x2 cells. I don't like this. It'd be better to the CTC TOTAL column to continue reporting the number of matched pairs... and it'll be just the case that the sums of cells will no longer equal the TOTAL column when non-1.0 weights are used. Recommend updating the contingency table classes accordingly.

Note also that if TOTAL reports the integer number of matched pairs rather than the sum of the weights, then the conversion from FHO to CTC is no longer well-defined for non-default 1.0 weights. Specifically, the sum of the weights is not reported in the existing FHO line type. Consider adding a new WEIGHT_SUM column to the end of the line type so that CTC can be derived from FHO.

Discussion of adding a new column to FHO line type was held in today's METplus wrappers meeting. After hearing the pros and cons, as well as the general reception of the rest of the METplus team that was present, it seems like the best option forward is to create a hard break: if a user desires area-weighted contingency table statistics, FHO is not a valid line type to request.

I am against the idea of creating an error if the user requests FHO while enabling area-weighted stats. Instead, a user should receive an FHO file that is empty, which indicates MET received the request for FHO but also will not fill it with meaningless values. If there is a desire to put a debug message that lets a user know that FHO output files were not filled in due to the use of area-weighted settings, that would be acceptable.

As coordinated over Slack, adding a check to disable FHO output if grid weighting is requested. Here's the corresponding warning message:

WARNING: 
WARNING: GridStatConfInfo::process_config() -> Disabling FHO output that is not compatible with grid weighting. Set "grid_weight_flag = NONE" to write FHO output.
WARNING: 

@j-opatz FYI, I updated the existing tests in unit_grid_weight.xml to request the CTC/CTS/MCTC/MCTS output be requested since it is now impacts by the grid_weight_flag setting. Listed below are CTC counts for the same data but with different grid weighting options applied:

DESC           ... LINE_TYPE TOTAL FY_OY          FY_ON        FN_OY        FN_ON          EC_VALUE
NO_WEIGHT      ... CTC       15113 6163           102          91           8757           0.5
COS_LAT_WEIGHT ... CTC       15113 5439.37053     82.98455     74.59037     5778.8472      0.5
AREA_WEIGHT    ... CTC       15113 16813231.10054 256296.93192 230381.69187 17815176.73687 0.5

Note that the NO_WEIGHT line has un-weighted integer counts. So the weights are a constant value of 1.0.
no_weight

The COS_LAT_WEIGHT line contains sums of weights, where the weights are defined by the absolute value of the cosine of the latitude... so bigger weights at the equator and smaller at the poles.
cos_lat_weight

The AREA_WEIGHT line contains sums of weights, where the weights are defined by the true grid box area in kilometers. The result is very similar to the COS_LAT weighting for a lat/lon grid.
area_weight

I notice that the sums of areas produces very large numbers. And I worry a bit about variable overflow. When multiplying a big number by another big number you get an even bigger number. Double-precision variables do support extremely large numbers (~1.79769e+308). So we should be fine.

I'm wondering if we should leave these numbers as-is. Or should we go through and normalize the weights to put them much closer to a value of 1.0? Is it better to report the weights exactly as-defined? Or is it better to normalize to make them more comparable to the un-weighted counts?

Is it better to report the weights exactly as-defined? Or is it better to normalize to make them more comparable to the un-weighted counts?

This is a great question. While my first reaction was to support normalizing the data to avoid extreme value returns, I'm concerned doing so would create a "firewall" for users who want the actual values. Similar to the FHO line type, I'd like to avoid an outcome where users would benefit more from the raw values even if they become absurdly large over what we give them, which compresses the actual value in the name of cleanliness.