tsutterley/pyTMD

Issues modelling tides with JSON definition files

Closed this issue · 5 comments

Hey @tsutterley, I've been trying to test out the new GOT5.5/5.6 using the JSON definition files included in tests. My directory looks like this:

[image: screenshot of the tide model directory]

I can run tide modelling for other models (e.g. FES2012) from a .def format definition file pretty easily, e.g.:

from pyTMD.compute import tide_elevations
import pandas as pd
import numpy as np

out = tide_elevations(
    x=[122.14],
    y=[-17.9],
    delta_time=pd.date_range("2020", "2021", periods=2),
    DIRECTORY="/gdata1/data/tide_models/",
    MODEL="FES2012",
    DEFINITION_FILE="/gdata1/data/tide_models/model_FES2012.def",
    DEFINITION_FORMAT="ascii",
    EPSG=4326,
    TIME="datetime",
    EXTRAPOLATE=True,
    CUTOFF=np.inf,
)

However, if I try to do something similar with a JSON format definition, I get a TypeError: intern() argument must be str, not list error:

out = tide_elevations(
    x=[122.14],
    y=[-17.9],
    delta_time=pd.date_range("2020", "2021", periods=2),
    DIRECTORY="/gdata1/data/tide_models/",
    MODEL="GOT5.6",
    DEFINITION_FILE="/gdata1/data/tide_models/model_GOT5.6.json",
    DEFINITION_FORMAT="json",
    EPSG=4326,
    TIME="datetime",
    EXTRAPOLATE=True,
    CUTOFF=np.inf,
)

Is there anything I'm doing obviously wrong here? I wasn't exactly sure what additional params need to be provided when specifying JSON definition files, beyond DEFINITION_FORMAT="json"...

Full error

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[23], line 40
18 import numpy as np
20 # out = tide_elevations(
21 #     x=[122.14],
22 #     y=[-17.9],
(...)
36
37 # out
---> 40 out = tide_elevations(
41     x=[122.14],
42     y=[-17.9],
43     delta_time=pd.date_range("2020", "2021", periods=2),
44     DIRECTORY="/gdata1/data/tide_models/",
45     MODEL="GOT5.6",
46     DEFINITION_FILE="/gdata1/data/tide_models/model_GOT5.6.json",
47     DEFINITION_FORMAT="json",
48     EPSG=4326,
49     TIME="datetime",
50     EXTRAPOLATE=True,
51     CUTOFF=np.inf,
52 )
54 out
56 # # from pyTMD.compute_tide_corrections import compute_tide_corrections
57
58 # # out = compute_tide_corrections(
(...)
68
69 # # out

File /env/lib/python3.10/site-packages/pyTMD/compute.py:299, in tide_elevations(x, y, delta_time, DIRECTORY, MODEL, ATLAS_FORMAT, GZIP, DEFINITION_FILE, DEFINITION_FORMAT, CROP, BOUNDS, EPSG, EPOCH, TYPE, TIME, METHOD, EXTRAPOLATE, CUTOFF, INFER_MINOR, APPLY_FLEXURE, FILL_VALUE, **kwargs)
297 # get parameters for tide model
298 if DEFINITION_FILE is not None:
--> 299 model = pyTMD.io.model(DIRECTORY).from_file(DEFINITION_FILE,
300 format=DEFINITION_FORMAT)
301 else:
302 model = pyTMD.io.model(DIRECTORY, format=ATLAS_FORMAT,
303 compressed=GZIP).elevation(MODEL)

File /env/lib/python3.10/site-packages/pyTMD/io/model.py:1408, in model.from_file(self, definition_file, format)
1406 self.from_ascii(fid)
1407 elif (format.lower() == 'json'):
-> 1408 self.from_json(fid)
1409 # close the definition file
1410 fid.close()

File /env/lib/python3.10/site-packages/pyTMD/io/model.py:1734, in model.from_json(self, fid)
1730 elif (temp.type == 'z') and (temp.directory is not None):
1731 # use glob strings to find files in directory
1732 glob_string = copy.copy(temp.model_file)
-> 1734 temp.model_file = list(temp.directory.glob(glob_string))
1735 # attempt to extract model directory
1736 try:

File /env/lib/python3.10/pathlib.py:1030, in Path.glob(self, pattern)
1028 if not pattern:
1029 raise ValueError("Unacceptable pattern: {!r}".format(pattern))
-> 1030 drv, root, pattern_parts = self._flavour.parse_parts((pattern,))
1031 if drv or root:
1032 raise NotImplementedError("Non-relative patterns are unsupported")

File /env/lib/python3.10/pathlib.py:74, in _Flavour.parse_parts(self, parts)
72 else:
73 if rel and rel != '.':
---> 74 parsed.append(sys.intern(rel))
75 if drv or root:
76 if not drv:
77 # If no drive is present, try to find one in the previous
78 # parts. This makes the result of parsing e.g.
79 # ("C:", "/", "a") reasonably intuitive.

TypeError: intern() argument must be str, not list
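The last two frames of this traceback can be reproduced without pyTMD: `Path.glob` expects a single pattern string, so passing it a list fails inside pathlib. Below is a minimal sketch using a temporary directory with made-up file names. The exact exception type and message depend on the Python version and platform, so the sketch catches both `TypeError` and `AttributeError` (the latter is an assumption about Windows builds, not something shown in this thread).

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for the tide model directory
with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    (directory / "k1.nc").touch()
    (directory / "m2.nc").touch()

    # A single pattern string works as expected
    single = sorted(p.name for p in directory.glob("*.nc"))

    # A list of patterns (like the JSON "model_file" entry) does not
    try:
        list(directory.glob(["k1.nc", "m2.nc"]))
        error_name = None
    except (TypeError, AttributeError) as exc:
        error_name = type(exc).__name__

print(single)  # ['k1.nc', 'm2.nc']
print(error_name)
```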

I guess on a similar/related note - does this look correct for loading JSON format definitions directly from a dictionary, e.g. passing the bytes_io object directly to DEFINITION_FILE?

import io
import json

# Example dictionary
data = {"format": "GOT-netcdf", "name": "GOT5.6", "model_file": ["GOT5.5/ocean_tides/2n2.nc", "GOT5.5/ocean_tides/j1.nc", "GOT5.5/ocean_tides/k1.nc", "GOT5.5/ocean_tides/k2.nc", "GOT5.6/ocean_tides/l2.nc", "GOT5.6/ocean_tides/m1.nc", "GOT5.5/ocean_tides/m2.nc", "GOT5.6/ocean_tides/m3.nc", "GOT5.5/ocean_tides/m4.nc", "GOT5.5/ocean_tides/ms4.nc", "GOT5.5/ocean_tides/mu2.nc", "GOT5.6/ocean_tides/n2.nc", "GOT5.5/ocean_tides/o1.nc", "GOT5.5/ocean_tides/oo1.nc", "GOT5.5/ocean_tides/p1.nc", "GOT5.5/ocean_tides/q1.nc", "GOT5.5/ocean_tides/s1.nc", "GOT5.5/ocean_tides/s2.nc", "GOT5.5/ocean_tides/sig1.nc"], "type": "z", "variable": "tide_ocean", "version": "5.6", "scale": 0.01, "compressed": False, "reference": "https://doi.org/10.1126/sciadv.abd4744"}

# Convert dictionary to BytesIO
bytes_io = io.BytesIO(json.dumps(data).encode('utf-8'))


out = tide_elevations(
    x=[122.14],
    y=[-17.9],
    delta_time=pd.date_range("2020", "2021", periods=2),
    DIRECTORY="/gdata1/data/tide_models/",
    MODEL="GOT5.6",
    DEFINITION_FILE=bytes_io,
    DEFINITION_FORMAT="json",
    EPSG=4326,
    TIME="datetime",
    EXTRAPOLATE=True,
    CUTOFF=np.inf,
)

Got it. Having the DIRECTORY argument triggers the glob functionality, so it is trying to search the directory for files. Right now that functionality only works with a single pattern (a string), not a list of files. If you drop the DIRECTORY argument and prepend /gdata1/data/tide_models/ to each model_file in the definition file, it should work. I will update the glob functionality to enable iterating on lists. That will also allow searching for constituent files in multiple directories.
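A minimal sketch of that workaround, assuming the definition is a dictionary (or a loaded JSON file) whose `model_file` entry holds paths relative to the model directory. The helper name `absolutize_model_files` is made up for illustration and is not part of pyTMD, and the definition is shortened to two constituents.

```python
import json
from pathlib import PurePosixPath


def absolutize_model_files(definition: dict, directory: str) -> dict:
    """Return a copy of a definition with `directory` prepended to each model file."""
    updated = dict(definition)
    model_file = definition["model_file"]
    # PurePosixPath keeps forward slashes regardless of platform
    # (the paths in this thread are POSIX paths)
    if isinstance(model_file, str):
        updated["model_file"] = str(PurePosixPath(directory) / model_file)
    else:
        updated["model_file"] = [
            str(PurePosixPath(directory) / f) for f in model_file
        ]
    return updated


# Shortened example definition (only two constituents shown)
definition = {
    "format": "GOT-netcdf",
    "name": "GOT5.6",
    "model_file": ["GOT5.5/ocean_tides/k1.nc", "GOT5.6/ocean_tides/l2.nc"],
    "type": "z",
}
updated = absolutize_model_files(definition, "/gdata1/data/tide_models/")
print(json.dumps(updated["model_file"], indent=2))
```

The updated dictionary can then be written back to a JSON file with `json.dump` and passed as DEFINITION_FILE without the DIRECTORY argument, as described above.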

Ah, makes sense - thanks!

One other slight usability improvement could also be auto-guessing the definition file format from the extension/datatype... e.g. if it's a JSON file or bytes_io, it's likely (always?) going to be JSON format. Perhaps the default for DEFINITION_FORMAT could be None or "auto" instead of "ascii", which would use "ascii" under-the-hood for all file extensions other than JSON/bytes_io? (and still allow passing "ascii" and "json" manually too like the current functionality)
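A rough sketch of what that auto-detection could look like. The function name `guess_definition_format` and the `"auto"` default are hypothetical and not pyTMD's actual API. The rule is the one proposed above: JSON for `.json` paths and in-memory byte streams, `"ascii"` for everything else, with an explicit format still taking precedence.

```python
import io
from pathlib import Path


def guess_definition_format(definition_file, format="auto"):
    """Pick 'json' or 'ascii' for a definition file (hypothetical helper)."""
    # An explicitly requested format always wins
    if format not in (None, "auto"):
        return format
    # In-memory byte streams are assumed to hold JSON
    if isinstance(definition_file, io.BytesIO):
        return "json"
    # Otherwise decide from the file extension
    if Path(definition_file).suffix.lower() == ".json":
        return "json"
    return "ascii"


print(guess_definition_format("/gdata1/data/tide_models/model_GOT5.6.json"))  # json
print(guess_definition_format("/gdata1/data/tide_models/model_FES2012.def"))  # ascii
print(guess_definition_format(io.BytesIO(b"{}")))  # json
```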

That's a great idea! Implemented in #319

Awesome, can confirm that both of these fixes/features work for me! Will do a small validation of the new GOT models and see how they perform at our tide gauges. 🙂