DOI-USGS/dataretrieval-python

`site_info` metadata for queries with `stateCd`, `huc`, or `bBox`

SorooshMani-NOAA opened this issue · 2 comments

Currently, the metadata object only has a working `site_info` function when the query used `sites` or `site_no`. Is there any reason this is not extended to all types of queries, e.g. by state code, by HUC, or even by bounding box?

https://github.com/USGS-python/dataretrieval/blob/2c8ed7d773084e4d426ad8b89ee1510d036393b9/dataretrieval/nwis.py#L847-L852

This is another good point, we should be able to set this metadata from those other queries as well -- will look into this.

For the sake of completeness, here's an example of the problem:

Query data from all sites in Rhode Island:

>>> import dataretrieval.nwis as nwis
>>> df, md = nwis.get_dv(stateCd='RI')
>>> md.site_info()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/jhariharan/Documents/dataretrieval-1/dataretrieval/nwis.py", line 851, in <lambda>
    md.site_info = lambda: what_sites(sites=parameters[alias])
  File "/Users/jhariharan/Documents/dataretrieval-1/dataretrieval/nwis.py", line 643, in what_sites
    response = query_waterservices(service='site', **kwargs)
  File "/Users/jhariharan/Documents/dataretrieval-1/dataretrieval/nwis.py", line 349, in query_waterservices
    return query(url, payload=kwargs)
  File "/Users/jhariharan/Documents/dataretrieval-1/dataretrieval/utils.py", line 169, in query
    raise ValueError("Bad Request, check that your parameters are correct. URL: {}".format(response.url))
ValueError: Bad Request, check that your parameters are correct. URL: https://waterservices.usgs.gov/nwis/site?format=rdb

In short, you can't get `site_info` metadata unless your query used `sites` or `site_no`, so the following works (building on the code above):

>>> df2, md2 = nwis.get_dv(sites=df.index.levels[0].tolist())
>>> md2.site_info()

(   agency_cd          site_no                                         station_nm site_tp_cd  ...  alt_va  alt_acy_va alt_datum_cd   huc_cd
0       USGS         01106000                 ADAMSVILLE BROOK AT ADAMSVILLE, RI         ST  ...   15.00        1.00       NGVD29  1090002
1       USGS         01109403   TEN MILE R., PAWTUCKET AVE. AT E. PROVIDENCE, RI         ST  ...    4.18        5.00       NAVD88  1090004
2       USGS         01111300                  NIPMUC RIVER NEAR HARRISVILLE, RI         ST  ...  339.22       10.00       NAVD88  1090003
3       USGS         01111400                   CHEPACHET RIVER AT CHEPACHET, RI         ST  ...  355.00        1.00       NGVD29  1090003
4       USGS         01111410  CHEPACHET RIVER WEST OF GAZZA RD AT GAZZAVILLE...         ST  ...  335.00        1.00       NGVD29  1090003
..       ...              ...                                                ...        ...  ...     ...         ...          ...      ...
64      USGS  412918071321001                    RI-SNW    6 SOUTH KINGSTOWN, RI         GW  ...  110.99        0.10       NAVD88  1090005
65      USGS  412932071374302                           RI-RIW  417 RICHMOND, RI         GW  ...  114.67        0.01       NAVD88  1090005
66      USGS  413252071323601                             RI-EXW  554 EXETER, RI         GW  ...  156.03        0.01       NAVD88  1090005
67      USGS  413358071433801                             RI-EXW  475 EXETER, RI         GW  ...  142.06        0.01       NAVD88  1090005
68      USGS  415546071474701                       RI-BUW  395 BURRILLVILLE, RI         GW  ...  574.23        1.00       NAVD88  1100001

[69 rows x 12 columns], <dataretrieval.utils.Metadata object at 0x123dc7fd0>)

The fix should involve modifying the code linked in the first post of this issue so that the `site_info` lambda is created more flexibly and supports metadata for the additional query parameters.
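A rough sketch of what that could look like, not the actual patch. The helper name `make_site_info` and the `LOCATION_KEYS` tuple are hypothetical, and the list of filters is an assumption about what the site service accepts rather than something taken from the library. The `what_sites` function is passed in as an argument so the logic can be tried without a network call:

```python
from typing import Any, Callable, Dict, Optional

# Assumed location filters, in priority order (hypothetical list, adjust as needed).
LOCATION_KEYS = ("sites", "site_no", "stateCd", "huc", "bBox", "countyCd")


def make_site_info(
    parameters: Dict[str, Any],
    what_sites: Callable[..., Any],
) -> Optional[Callable[[], Any]]:
    """Return a zero-argument site_info callable, or None if no location filter was used."""
    for key in LOCATION_KEYS:
        if key in parameters:
            # The existing code forwards both `sites` and `site_no` as `sites`;
            # other filters are forwarded under their own name.
            target = "sites" if key == "site_no" else key
            value = parameters[key]
            return lambda: what_sites(**{target: value})
    return None


# Stand-in for nwis.what_sites that just echoes the keyword arguments it receives.
def fake_what_sites(**kwargs: Any) -> Dict[str, Any]:
    return kwargs


site_info = make_site_info({"stateCd": "RI", "parameterCd": "00060"}, fake_what_sites)
print(site_info())
```

With something like this, the metadata object would only get a `site_info` attribute when one of the location filters is present, instead of failing with a Bad Request when it is called.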