DOI-USGS/dataretrieval-python

Equivalent to whatNWISdata from R version

ArlexMR opened this issue · 2 comments

I need to filter the sites with a minimum number of records of predefined parameters. Is there any function to get the number of records given the site and parameter codes? I think it is something like 'whatNWISdata' from the R package.

Yes, I believe the dataretrieval.nwis.get_info() function is what you are looking for.

As an example, if we want to get information about site "05114000" and we use the R whatNWISdata function, that would look like the following:

data <- dataRetrieval::whatNWISdata(siteNumber='05114000')

The returned data frame has the following column names:

> colnames(data)
 [1] "agency_cd"          "site_no"            "station_nm"         "site_tp_cd"        
 [5] "dec_lat_va"         "dec_long_va"        "coord_acy_cd"       "dec_coord_datum_cd"
 [9] "alt_va"             "alt_acy_va"         "alt_datum_cd"       "huc_cd"            
[13] "data_type_cd"       "parm_cd"            "stat_cd"            "ts_id"             
[17] "loc_web_ds"         "medium_grp_cd"      "parm_grp_cd"        "srs_id"            
[21] "access_cd"          "begin_date"         "end_date"           "count_nu"  

Using Python, we can get the same result using the get_info() function:

from dataretrieval import nwis
df, md = nwis.get_info(sites='05114000', seriesCatalogOutput=True)

Note that we have to specify seriesCatalogOutput=True. The R package does this automatically, whereas in Python the user controls that argument, and its default is False.

If we list the columns in the returned df data frame, they should match what we got using R:

>>> df.columns
Index(['agency_cd', 'site_no', 'station_nm', 'site_tp_cd', 'dec_lat_va',
       'dec_long_va', 'coord_acy_cd', 'dec_coord_datum_cd', 'alt_va',
       'alt_acy_va', 'alt_datum_cd', 'huc_cd', 'data_type_cd', 'parm_cd',
       'stat_cd', 'ts_id', 'loc_web_ds', 'medium_grp_cd', 'parm_grp_cd',
       'srs_id', 'access_cd', 'begin_date', 'end_date', 'count_nu'],
      dtype='object')
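To get back to the original question (keeping only sites with a minimum number of records for given parameters), here is a minimal sketch of how the catalog could be filtered with pandas. The helper name sites_with_min_records and the sample rows are made up for illustration; the sketch assumes that count_nu holds the record count for each series and that parm_cd and site_no come back as strings, so please check both against your own output from get_info().

```python
import pandas as pd


def sites_with_min_records(catalog, parameter_codes, min_records):
    """Return the site numbers that have at least `min_records` records
    for every code in `parameter_codes` (hypothetical helper)."""
    # count_nu may arrive as text, so coerce it to numbers first.
    counts = pd.to_numeric(catalog["count_nu"], errors="coerce").fillna(0)
    subset = catalog.assign(count_nu=counts)
    subset = subset[subset["parm_cd"].isin(parameter_codes)]

    # A site can list the same parameter several times (different data
    # types or statistics), so keep the largest count per site and parameter.
    best = subset.groupby(["site_no", "parm_cd"], as_index=False)["count_nu"].max()
    enough = best[best["count_nu"] >= min_records]

    # Keep sites where every requested parameter passed the threshold.
    passed = enough.groupby("site_no")["parm_cd"].nunique()
    return sorted(passed[passed == len(set(parameter_codes))].index)


# Made-up rows that mimic the relevant columns of the catalog.
# With real data, `catalog` would be the df returned by
# nwis.get_info(..., seriesCatalogOutput=True).
catalog = pd.DataFrame(
    {
        "site_no": ["05114000", "05114000", "05114000", "05120000", "05120000"],
        "parm_cd": ["00060", "00065", "00060", "00060", "00065"],
        "count_nu": [12000, 3000, 150, 8000, 40],
    }
)

result = sites_with_min_records(catalog, ["00060", "00065"], min_records=1000)
print(result)
```

Taking the maximum count per site and parameter is only one possible choice; depending on your needs you might instead filter on data_type_cd first or sum the counts.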

Hope that helps.

Thanks! That's just what I was looking for