dtcenter/METplus

Enhance ASCII2NC wrapper as needed to support the development of an ISMN use case


Describe the Enhancement

Pull request dtcenter/MET#2758 for issue dtcenter/MET#2701 adds support to ascii2nc for the -format ismn option, which reads soil moisture data (and other variable types) from the International Soil Moisture Network (ISMN). This issue is to update the METplus ASCII2NC wrapper to handle this new format option.

Note that a new use case should also be added to demonstrate the use of this ISMN data for verification, but no issue exists for that yet. I recommend that @anewman89 create one, either now or once the ASCII2NC wrapper has been updated.

Sample ISMN data used by the MET unit tests can be found here:
https://dtcenter.ucar.edu/dfiles/code/METplus/MET/MET_unit_test/unit_test/obs_data/ismn/

The data is stored in a separate file for each combination of site, instrument, and variable, and each file contains data for multiple times. In the full data archive (at https://ismn.earth/en/dataviewer/api/download_archive), each file contains decades of data. Please coordinate with @anewman89 to determine the workflow for ingesting this data into METplus.

Time Estimate

Estimate the amount of work required here.
Issues should represent approximately 1 to 3 days of work.

Sub-Issues

Consider breaking the enhancement down into sub-issues.
None needed.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

Define the source of funding and account keys here or state NONE.
This is part of the METplus Land projects and has been added to that project board.

Define the Metadata

Assignee

  • Select engineer(s) or no engineer required
  • Select scientist(s) or no scientist required

Labels

  • Select component(s)
  • Select priority
  • Select requestor(s)

Projects and Milestone

  • Select Repository and/or Organization level Project(s) or add alert: NEED CYCLE ASSIGNMENT label
  • Select Milestone as the next official version or Future Versions

Define Related Issue(s)

Consider the impact to the other METplus components.

Enhancement Checklist

See the METplus Workflow for details.

  • Complete the issue definition above, including the Time Estimate and Funding Source.
  • Fork this repository or create a branch of develop.
    Branch name: feature_<Issue Number>_<Description>
  • Complete the development and test your changes.
  • Add/update log messages for easier debugging.
  • Add/update unit tests.
  • Add/update documentation.
  • Add any new Python packages to the METplus Components Python Requirements table.
  • Push local changes to GitHub.
  • Submit a pull request to merge into develop.
    Pull request: feature <Issue Number> <Description>
  • Define the pull request metadata, as permissions allow.
    Select: Reviewer(s) and Development issues
    Select: Repository level development cycle Project for the next official release
    Select: Milestone as the next official version
  • Iterate until the reviewer(s) accept and merge your changes.
  • Delete your fork or branch.
  • Close this issue.

@JohnHalleyGotway, if the only work needed for this issue is to allow setting -format ismn via the wrapper config, then I don't think any code changes are needed. A user can already do this by adding the following to the METplus config:

ASCII2NC_INPUT_FORMAT = ismn

There is no error checking in the wrapper for the value set there. Are there any specific changes you can think of that would be needed besides this?
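To make the expected result concrete, here is a minimal sketch of the command line that this setting should lead to. It assumes the ascii2nc usage of input file(s), then the output file, then options. The helper name `build_ascii2nc_command` and the file names are hypothetical and are not part of METplus.

```python
import shlex


def build_ascii2nc_command(input_files, output_file, input_format=None):
    """Assemble an ascii2nc argument list (hypothetical helper, not METplus code)."""
    if not input_files:
        raise ValueError("at least one input file is required")
    # Inputs come first, followed by the single NetCDF output file.
    cmd = ["ascii2nc", *[str(f) for f in input_files], str(output_file)]
    # Pass the format through unchanged, mirroring the wrapper's lack of validation.
    if input_format:
        cmd += ["-format", input_format]
    return cmd


# File names below are placeholders for illustration only.
cmd = build_ascii2nc_command(["site_a_sm.stm"], "ismn_obs.nc", "ismn")
print(shlex.join(cmd))
```

Because the value is passed through without validation, a typo in ASCII2NC_INPUT_FORMAT would only surface when ascii2nc itself rejects it.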

Great! Glad to hear that no additional mods are needed to support the setting of -format ismn in the call to ASCII2NC.

I'd say the major challenge in working with this dataset is how many input files are involved. The sample ISMN data I've added for the MET unit tests are a tiny fraction of the full dataset. For each of the ~24,365 sensors in the entire lifetime of the network, there is a single file containing the full history of reports for that sensor. From the ISMN website, users can query a spatial and/or temporal subset or pull the complete archive up to the current day.

I'm curious to see how @anewman89 sets up a METplus use case to provide the list of input files. He may want to provide a top-level directory and search it recursively for any file matching *.stm, but I'm really not sure. It all depends on how the ISMN data is retrieved and organized. I'm guessing that they'll want to filter the ISMN obs by time and/or space. Currently, spatial masking is supported through the -mask_grid, -mask_poly, and -mask_sid command line options, but there is no way to filter by time. dtcenter/MET#2654 mentions enhancing ASCII2NC to filter by time, and I'm guessing that'll be useful for ISMN data as well. But we'll see what @anewman89 decides is needed.
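As one possible illustration of the recursive search idea, the sketch below collects every .stm file under a top-level directory using only the Python standard library. The function name `find_ismn_files` and the directory layout in the demo are assumptions made for this example; the real archive layout depends on how the data is downloaded.

```python
import tempfile
from pathlib import Path


def find_ismn_files(top_dir, suffix=".stm"):
    """Return a sorted list of regular files under top_dir ending in suffix."""
    return sorted(p for p in Path(top_dir).rglob(f"*{suffix}") if p.is_file())


# Demo on a throwaway directory tree (the layout is invented for illustration).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "network_a" / "station_1").mkdir(parents=True)
    (root / "network_a" / "station_1" / "sm_sensor1.stm").touch()
    (root / "network_a" / "station_1" / "ts_sensor2.stm").touch()
    (root / "network_a" / "readme.txt").touch()
    found_names = [p.name for p in find_ismn_files(root)]

print(found_names)
```

With tens of thousands of sensor files in the full archive, a list built this way may be too long to pass directly on a command line, so the use case may need to process the files in batches.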

Changes to the wrapper may still be needed as the use case is developed, but it sounds like none are needed for the enhancement to the ASCII2NC tool itself.

As discussed during the METplus Land Project meeting on 4/15/24, the -valid_beg and -valid_end command line options being added to the METplus ASCII2NC wrapper via #2547 may also be useful when processing the ISMN data. Just documenting that dependency here as filtering the ISMN data by time may be useful when developing the use case.
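A rough sketch of how a time window could be turned into those two options is shown below. It assumes the options take time strings in the YYYYMMDD_HHMMSS form used elsewhere in MET; the helper name `time_window_args` and the example dates are hypothetical.

```python
from datetime import datetime

# Assumed time string format (YYYYMMDD_HHMMSS), as used by other MET options.
MET_TIME_FMT = "%Y%m%d_%H%M%S"


def time_window_args(valid_beg=None, valid_end=None):
    """Build -valid_beg/-valid_end arguments (hypothetical helper)."""
    if valid_beg is not None and valid_end is not None and valid_beg > valid_end:
        raise ValueError("valid_beg must not be later than valid_end")
    args = []
    if valid_beg is not None:
        args += ["-valid_beg", valid_beg.strftime(MET_TIME_FMT)]
    if valid_end is not None:
        args += ["-valid_end", valid_end.strftime(MET_TIME_FMT)]
    return args


# Example window chosen arbitrarily for illustration.
window = time_window_args(datetime(2024, 4, 1), datetime(2024, 4, 15, 23, 59, 59))
print(" ".join(window))
```

Since each ISMN file can hold decades of reports, restricting the window this way would keep the output limited to the period being verified.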