pnuu/fmiopendata

download_stored_query & timeseries - KeyError

Closed this issue · 3 comments

Hi, you seem to have a typo in your manual in the following part:

"It is also possible to collect the data to a structure more usable for timeseries analysis by adding "timeseries=True" to the arguments:

from fmiopendata.wfs import download_stored_query

obs = download_stored_query("fmi::observations::weather::multipointcoverage",
args=["bbox=25,60,25.5,60.5",
"timeseries=True"])"

Here, "timeseries" should be "Timeseries" with a capital T.
If I write "timeseries" in my code as follows:

import datetime as dt
from fmiopendata.wfs import download_stored_query

end_time = dt.datetime.utcnow()
start_time = end_time - dt.timedelta(days=2)
start_time = start_time.isoformat(timespec="seconds") + "Z"
end_time = end_time.isoformat(timespec="seconds") + "Z"

obs = download_stored_query("fmi::observations::weather::multipointcoverage",
args=["bbox=24.88,60.14,25.01,60.19",
"timeseries=True"])

latest_tstep = max(obs.data.keys())
print(sorted(obs.data[latest_tstep].keys()))
print(sorted(obs.data[latest_tstep]['Helsinki Kaisaniemi'].keys()))

it raises the following error at print(sorted(obs.data[latest_tstep]['Helsinki Kaisaniemi'].keys())): KeyError: 'Helsinki Kaisaniemi'

pnuu commented

The library works as expected, but you are accessing the data in the non-timeseries way 😉 As the documentation says, with timeseries=True the data are organized by location:

print(obs.data.keys())
# -> dict_keys(['Helsinki Kaisaniemi'])
print(obs.data["Helsinki Kaisaniemi"].keys())
# -> dict_keys(['times', 't2m', 'ws_10min', 'wg_10min', 'wd_10min', 'rh', 'td', 'r_1h', 'ri_10min', 'snow_aws', 'p_sea', 'vis', 'n_man', 'wawa'])
print(obs.data["Helsinki Kaisaniemi"]["t2m"].keys())                   
# -> dict_keys(['values', 'unit'])
print(obs.data["Helsinki Kaisaniemi"]["t2m"]["values"][0])             
# -> 7.7

When you use the capitalized version, you get the same result as you'd get with timeseries=False, that is, not a timeseries version.
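To make the location-keyed layout concrete, here is a minimal sketch that pairs each timestamp with its value for one parameter. It runs on a hand-written mock dictionary shaped like the keys printed above, not on a real download. The numbers, the timestamps, the "degC" unit string and the `series` helper are illustrative assumptions.

```python
import datetime as dt

# Mock of the location-keyed layout shown above.
# All values, timestamps and the unit string are made up for illustration.
data = {
    "Helsinki Kaisaniemi": {
        "times": [
            dt.datetime(2021, 1, 1, 0, 0),
            dt.datetime(2021, 1, 1, 0, 10),
            dt.datetime(2021, 1, 1, 0, 20),
        ],
        "t2m": {"values": [7.7, 7.5, 7.4], "unit": "degC"},
    }
}


def series(data, location, parameter):
    """Hypothetical helper: return (time, value) pairs for one parameter at one location."""
    station = data[location]
    return list(zip(station["times"], station[parameter]["values"]))


t2m_series = series(data, "Helsinki Kaisaniemi", "t2m")
for time, value in t2m_series:
    print(time.isoformat(), value)
```

With a real result, the same helper would presumably be called as series(obs.data, "Helsinki Kaisaniemi", "t2m").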

Thank you very much @pnuu! I got it to work somehow now.

I'm very sorry if this is not the right place to ask for further instructions, but I would need help in understanding why my script does not return me enough data. I would like to download historical weather data for the past 10 years or so, but I'm unable to download even a full day of data with my current script:

print(dt.datetime.utcnow())
print(dt.timedelta(hours=9))
end_time = dt.datetime.utcnow()
start_time = end_time - dt.timedelta(hours=24)
# Convert times to properly formatted strings
start_time = start_time.isoformat(timespec="seconds") + "Z"
end_time = end_time.isoformat(timespec="seconds") + "Z"
print("Start time:", start_time, "\nEnd time:", end_time)

obs = download_stored_query("fmi::observations::weather::multipointcoverage",
args=["bbox=24.87,60.14,25.01,60.19", "bbox=24.88,60.14,25.01,60.19",
"timeseries=True"])

print("SORTED OBS DATA KEYS:", sorted(obs.data.keys()))
print(obs.data["Helsinki Kaisaniemi"].keys())
print(obs.data["Helsinki Kaisaniemi"]["t2m"].keys()) # t2m = temperature
print(obs.data["Helsinki Kaisaniemi"]["t2m"]["values"])
print(len((obs.data["Helsinki Kaisaniemi"]["times"])))
for i in range(0, len((obs.data["Helsinki Kaisaniemi"]["times"]))):
    print(obs.data["Helsinki Kaisaniemi"]["times"][i])

I have tried changing the hours value in start_time = end_time - dt.timedelta(hours=24) to 48, 100, or 200, but no matter what I enter, the script always returns 72 data points (len(obs.data["Helsinki Kaisaniemi"]["times"]) returns 72).
What should I do in order to download more data points, so that I would be able to download for example a full month of Kaisaniemi weather data?

pnuu commented

I would need help in understanding why my script does not return me enough data. I would like to download historical weather data for the past 10 years or so, but I'm unable to download even a full day of data with my current script:

end_time = dt.datetime.utcnow()
start_time = end_time - dt.timedelta(hours=24)
# Convert times to properly formatted strings
start_time = start_time.isoformat(timespec="seconds") + "Z"
end_time = end_time.isoformat(timespec="seconds") + "Z"

obs = download_stored_query("fmi::observations::weather::multipointcoverage",
args=["bbox=24.87,60.14,25.01,60.19", "bbox=24.88,60.14,25.01,60.19",
"timeseries=True"])

What should I do in order to download more data points, so that I would be able to download for example a full month of Kaisaniemi weather data?

You haven't used start_time and end_time in the query. Also, to be sure to get data only from Kaisaniemi, use the place parameter (see the FMI Open Data manual for the syntax used below). The call should be:

obs = download_stored_query("fmi::observations::weather::multipointcoverage", args=["place=kaisaniemi,helsinki", "timeseries=True", "starttime=" + start_time, "endtime=" + end_time])
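As a rough sketch of how the argument list in that call can be assembled, the snippet below formats two datetime objects the same way as the script in the question and builds the list of "key=value" strings. The `build_query_args` helper is hypothetical and not part of fmiopendata, and a fixed end time is used only to make the printed result reproducible.

```python
import datetime as dt


def build_query_args(place, start, end):
    """Hypothetical helper: build the args list for a timeseries query of one place."""
    def fmt(t):
        # Same formatting as in the question: ISO 8601 with seconds and a trailing "Z"
        return t.isoformat(timespec="seconds") + "Z"

    return [
        "place=" + place,
        "timeseries=True",
        "starttime=" + fmt(start),
        "endtime=" + fmt(end),
    ]


# Fixed times for a reproducible example; a real script would start from the current UTC time
end_time = dt.datetime(2021, 5, 2, 12, 0, 0)
start_time = end_time - dt.timedelta(hours=24)
args = build_query_args("kaisaniemi,helsinki", start_time, end_time)
print(args)
```

The resulting list can be passed as the args argument of download_stored_query.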

As a general note, I suggest not downloading 10 years of data in one go, but splitting it into chunks of, for example, one month. That way possible network errors won't ruin the whole query, and more importantly, it will be easier on the servers at FMI.
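One possible way to do the splitting is sketched below. It only computes the chunk boundaries and does not call the library. The `time_chunks` helper and its 30-day default are assumptions made for illustration. The chunk length is a parameter because the server may enforce its own maximum interval per request.

```python
import datetime as dt


def time_chunks(start, end, days=30):
    """Hypothetical helper: yield (chunk_start, chunk_end) pairs covering start..end.

    Each chunk spans at most `days` days. Consecutive chunks share a boundary,
    so an observation falling exactly on a boundary may appear in both.
    """
    step = dt.timedelta(days=days)
    chunk_start = start
    while chunk_start < end:
        chunk_end = min(chunk_start + step, end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end


chunks = list(time_chunks(dt.datetime(2020, 1, 1), dt.datetime(2020, 3, 1)))
for chunk_start, chunk_end in chunks:
    # Each pair would be formatted and passed as starttime/endtime in one query
    print(chunk_start.isoformat(), "->", chunk_end.isoformat())
```

Running one query per pair and catching errors per chunk means a failed request only needs that chunk to be retried.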