dtcenter/MET

Add new wind direction verification statistics for RMSE, Bias, and MAE

Closed this issue · 12 comments

Describe the New Feature

Wind direction verification differs from vector wind and wind speed verification. What is specific to wind direction is that both the wind direction and the wind direction difference between forecast and observation need to be adjusted to the range (-180, +180] degrees. The current VCNT and VL1L2 outputs don't support RMSE, Bias, and MAE for wind direction.

The requested new outputs are:

  1. N: the number of grid points

  2. MSE (mean squared value of the wind direction difference between the forecast and observed wind; this difference needs to be adjusted to (-180, +180] degrees)
    DIR2_BAR=sum[(DIRf-DIRo)**2]/N, where DIRf-DIRo needs to be recalculated to (-180, +180], DIRf is the forecast wind direction, and DIRo is the observed wind direction.

  3. Mean Error (mean value of the wind direction difference between the forecast and observed wind; this difference needs to be adjusted to (-180, +180] degrees)
    DIR_BAR=sum(DIRf-DIRo)/N, where DIRf-DIRo needs to be recalculated to (-180, +180]

  4. MAE (mean absolute value of the wind direction difference between the forecast and observed wind; this difference needs to be adjusted to (-180, +180] degrees)
    DIR_ABSBAR=sum[abs(DIRf-DIRo)]/N, where DIRf-DIRo needs to be recalculated to (-180, +180]

Meanwhile, the outputs will allow wind speed threshold settings.
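
For illustration, the requested quantities can be sketched in a few lines of Python. This is only a sketch of the formulas above, not MET code: the function names `wrap_dir_diff` and `dir_stats` are hypothetical, and the inputs are assumed to be wind directions in degrees with no bad data.

```python
import math


def wrap_dir_diff(dir_f, dir_o):
    """Return dir_f - dir_o in degrees, wrapped to (-180, +180]."""
    diff = (dir_f - dir_o) % 360.0  # now in [0, 360)
    if diff > 180.0:
        diff -= 360.0               # now in (-180, +180]
    return diff


def dir_stats(dirs_f, dirs_o):
    """Compute N, DIR_BAR, DIR_ABSBAR, DIR2_BAR and the derived RMSE."""
    diffs = [wrap_dir_diff(f, o) for f, o in zip(dirs_f, dirs_o)]
    n = len(diffs)
    dir_bar = sum(diffs) / n                     # mean error (bias)
    dir_absbar = sum(abs(d) for d in diffs) / n  # mean absolute error
    dir2_bar = sum(d * d for d in diffs) / n     # mean squared error
    return {
        "N": n,
        "DIR_BAR": dir_bar,
        "DIR_ABSBAR": dir_absbar,
        "DIR2_BAR": dir2_bar,
        "RMSE": math.sqrt(dir2_bar),
    }
```

Without the wrap, a forecast of 350 degrees against an observation of 10 degrees would count as a 340 degree error; with it, the difference is -20 degrees.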

Acceptance Testing

List input data types and sources.
Describe tests required for new functionality.

Time Estimate

2 days.

Sub-Issues

Consider breaking the new feature down into sub-issues.
None needed.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Define the source of funding and account keys here or state NONE.

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Select Repository and/or Organization level Project(s) or add alert: NEED PROJECT ASSIGNMENT label
  • Select Milestone as the next official version or Future Versions

Define Related Issue(s)

Consider the impact to the other METplus components.

New Feature Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Development issues
    Select: Repository level development cycle Project for the next official release
    Select: Milestone as the next official version
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.

@malloryprow and @YaliMao-NOAA, I'm working on this issue and have a question about the degenerate case.

First, I'm adding this to the existing VL1L2Info class and plan to write the resulting stats to the VCNT continuous vector statistics line type. The TOTAL column in that line indicates the number of points used to compute the statistics in it. If any of the 4 forecast or observation U or V wind component values is bad data, then that point will be excluded from these stats. But if the forecast U and V or the observation U and V are both 0, then the direction is not defined for that vector. And MET sets it to bad data... and so the forecast - observation direction difference is also bad data.

The question is what to do in this case. Practically speaking, this shouldn't come up that often because you should configure MET to compute vector stats using some wind speed threshold... meaning that we'll only evaluate vectors whose direction is well-defined.

But we do need to decide how the MET software should handle this.

Options include:

  1. Just skip over that point and don't include the direction diff in the running sum. Given how weights are currently computed, that's mathematically equivalent to saying the direction difference is 0 when either or both directions are not defined. Optionally, I could keep track of how often this occurs and write a log message with that info.

  2. Keep separate counts for the valid U/V pairs and valid wind direction differences. This would exclude that point from the statistics, but require more bookkeeping. And there's currently no output column to indicate the total number of U/V pairs vs the total number of wind direction differences. So while the stats would be more accurate, we wouldn't actually be writing the count... unless we add it.
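
To make the bookkeeping in option 2 concrete, here is a rough Python sketch. It is not the VL1L2Info implementation: `wind_dir` and `direction_diffs` are hypothetical names, the meteorological "direction the wind blows from" convention is an assumption, and `None` stands in for MET's bad data value.

```python
import math


def wind_dir(u, v):
    """Direction the wind blows from, in degrees [0, 360).

    Returns None for a zero vector, whose direction is undefined.
    """
    if u == 0.0 and v == 0.0:
        return None
    return math.degrees(math.atan2(-u, -v)) % 360.0


def direction_diffs(pairs):
    """pairs: iterable of (uf, vf, uo, vo) tuples with no bad data.

    Returns (total, total_dir, diffs), where total counts all U/V pairs
    and total_dir counts only pairs with a defined direction difference.
    """
    total = 0
    diffs = []
    for uf, vf, uo, vo in pairs:
        total += 1
        dir_f = wind_dir(uf, vf)
        dir_o = wind_dir(uo, vo)
        if dir_f is None or dir_o is None:
            continue  # undefined direction: skip, but still count the pair
        d = (dir_f - dir_o) % 360.0
        if d > 180.0:
            d -= 360.0
        diffs.append(d)
    return total, len(diffs), diffs
```

Dividing the summed differences by `total` corresponds to option 1 (undefined differences effectively count as 0), while dividing by `total_dir` corresponds to option 2.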

Just wanted to lay out these considerations to see if you can provide any direction on them.

@JohnHalleyGotway,

@YaliMao-NOAA has code that currently computes wind direction statistics, so hopefully she can provide more insight into what her code does in this situation.

@JohnHalleyGotway What statistics are computed for these data? Is it using directional and/or circular statistics (e.g., von Mises distributions)? I don't see why you can't have zero unless the statistics are undefined there (division or log of zero). I would think the information about u and v being zero in both forecast and obs (or only one) could be useful.

@JohnHalleyGotway
Option 1 looks more practical to me, but I have one question. For the skipped points where wind direction is not defined (wind speed is zero), will they be skipped if a wind speed threshold is applied?

@YaliMao-NOAA, no, if a wind speed threshold is used, then those 0-vectors will be filtered out and the wind directions will be well-defined. So this is easily addressed with a configuration file option.

But there are specific details to consider...

  • If you apply the wind speed threshold to the forecast AND observation vectors, this should never happen because both will have speed > 0.
  • If you apply the wind speed threshold to only the observation vectors (or only the forecast vectors), then you could still encounter 0-vectors with undefined direction.
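
The two cases above can be sketched in Python as follows. This is a simplified illustration, not MET's filtering code: `passes` and `keep_pair` are hypothetical helpers, `None` models an NA threshold, thresholds are treated as strict "greater than" values, and requiring both sides to pass is an assumption about how the forecast and observation checks are combined.

```python
import math


def passes(speed, thresh):
    """Return True if speed exceeds thresh; thresh=None models NA (always true)."""
    return True if thresh is None else speed > thresh


def keep_pair(uf, vf, uo, vo, fcst_thresh=None, obs_thresh=None):
    """Keep a U/V pair only if both wind speeds pass their thresholds."""
    f_speed = math.hypot(uf, vf)
    o_speed = math.hypot(uo, vo)
    return passes(f_speed, fcst_thresh) and passes(o_speed, obs_thresh)


# Zero forecast vector, 5 m/s observation vector.
pair = (0.0, 0.0, 3.0, 4.0)
both = keep_pair(*pair, fcst_thresh=1.0, obs_thresh=1.0)       # filtered out
obs_only = keep_pair(*pair, fcst_thresh=None, obs_thresh=1.0)  # kept
```

With the threshold on both sides the zero forecast vector is dropped; with it on the observation side only, the pair is kept and its direction difference is still undefined.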

Based on recent issues raised on WCOSS2, I'll write a DEBUG level 3 log message about this rather than a WARNING message which would raise alarm bells.

If we want to be able to aggregate these direction stats across multiple runs, I should also add new columns to the VL1L2 vector partial sums output line, so that we can aggregate VL1L2 lines and then derive VCNT stats from the aggregated lines. I'll plan on doing that as well.

@JohnHalleyGotway Perhaps another way to go would be to have a default to a threshold (for both) that is just a little more than zero along with a note in the config file explaining why you want to use a threshold?

That's what I am thinking about too. The default threshold should be applied to both forecast and observation. WAFS wind direction uses a wind speed threshold of 10 knots = 5.14444 m/s, which may be too big as a default non-zero value for general verification.

@YaliMao-NOAA and @malloryprow, the current default in Point-Stat for the wind speed threshold is NA:

wind_thresh = [ NA ];

Setting a threshold to NA always evaluates to true. So, by default, no U/V matched pairs are excluded based on their wind speed.

I've thought carefully about this, and do not want to change this default setting. Doing so would cause a change in the output for any users relying on that default setting. And we try to avoid that as much as possible. Instead, we want newer versions of the code to produce the same output as earlier versions... unless there's a bugfix that intentionally changes the output.

The current version in my feature branch counts up the number of zero vectors for which direction is not defined. And it prints a warning message like this:

DEBUG 2: Processing VGRD/Z10 versus VGRD/Z10, for observation type ADPSFC, over region DTC165, for interpolation method NEAREST(1), using 934 matched pairs.
DEBUG 2: Computing Categorical Statistics.
DEBUG 2: Computing Scalar Partial Sums and Continuous Statistics.
DEBUG 2: Computing Vector Partial Sums and Continuous Vector Statistics.
WARNING: 
WARNING: VL1L2Info::compute_stats() -> Skipping 249 of 934 vector pairs for which the direction difference is undefined.
WARNING: Set the "wind_thresh" and "wind_logic" configuration options to exclude zero vectors.
WARNING: 

This alerts the user to the presence of zero vectors and recommends what to do.
But I'm wondering about the logging level. Since NCO is so averse to the presence of WARNING messages, should I write this as a DEBUG(3) log message instead of WARNING?

I do note that generally it really is fine to write VL1L2 vector partial sums without wind_thresh being set. The zero vectors are only a problem for these new wind direction stats. All of the other existing partial sums aggregate zero vectors just fine. So I'm torn as to whether or not this should be a WARNING message.

Thoughts?

I think WARNING is fine. It is alerting us to something we can change in the configuration, and thus would get rid of the WARNING.

Reopening this issue and reassigning to the MET-12.0.0 beta5 development cycle based on discussion dtcenter/METplus#2590.

Recommend adding the following changes:

  1. Update the VL1L2Info class to track the number of U/V pairs for which both the forecast and observation vectors are non-zero, which is required for wind direction to be well-defined.
  2. Update the VL1L2, VAL1L2, and VCNT line types to report that count in a new TOTAL_DIR column.
  3. Update the logic of Stat-Analysis to parse this new TOTAL_DIR column and use it to aggregate wind direction statistics rather than the existing TOTAL column.
  4. Coordinate with METdataio and METcalcpy to update the parsing and aggregation logic there.
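
As a rough illustration of item 3, aggregating direction means across runs with TOTAL_DIR as the weight might look like the Python sketch below. This is not the Stat-Analysis code: `aggregate_dir_sums` is a hypothetical name, each line is modeled as a plain dict, and the DIR_BAR, DIR_ABSBAR, and DIR2_BAR keys reuse the names from the original request, which may differ from the actual output column names.

```python
def aggregate_dir_sums(lines):
    """Aggregate per-run direction means, weighting each by TOTAL_DIR.

    lines: list of dicts with TOTAL, TOTAL_DIR, DIR_BAR, DIR_ABSBAR, DIR2_BAR.
    Returns None if no line has a defined direction difference.
    """
    valid = [ln for ln in lines if ln["TOTAL_DIR"] > 0]
    total_dir = sum(ln["TOTAL_DIR"] for ln in valid)
    if total_dir == 0:
        return None
    agg = {
        "TOTAL": sum(ln["TOTAL"] for ln in lines),
        "TOTAL_DIR": total_dir,
    }
    for key in ("DIR_BAR", "DIR_ABSBAR", "DIR2_BAR"):
        # Each mean was computed over TOTAL_DIR points, so use that weight.
        agg[key] = sum(ln[key] * ln["TOTAL_DIR"] for ln in valid) / total_dir
    return agg


runs = [
    {"TOTAL": 10, "TOTAL_DIR": 5, "DIR_BAR": 10.0, "DIR_ABSBAR": 20.0, "DIR2_BAR": 500.0},
    {"TOTAL": 10, "TOTAL_DIR": 10, "DIR_BAR": -5.0, "DIR_ABSBAR": 5.0, "DIR2_BAR": 50.0},
]
combined = aggregate_dir_sums(runs)
```

In this example, weighting DIR_ABSBAR by TOTAL_DIR gives 10.0, whereas weighting by TOTAL would give 12.5, over-weighting the run in which half of the pairs had no defined direction.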

@JohnHalleyGotway Can this wind direction MET issue be marked as required for the official release? Thanks!

Release Acceptance Testing Summary

Version: MET-12.0.0-beta5
Date: 9/30/2024
Location: WCOSS2
Status (PASS/FAIL): PASS
Description: I see the extra column in the line types. I also see lines where TOTAL_DIR differs from TOTAL. The WARNING messages are gone too.