MiraGeoscience/geoh5py

Interfacing with geoh5py grid objects

Closed this issue ยท 12 comments

Hello,

I was wondering if there was a quick tutorial on interfacing with geoh5py grid objects? I would like to do some processing with harmonica (https://pypi.org/project/harmonica/) and verde (https://pypi.org/project/verde/) and do some I/O with geoscience analyst.

Cheers!
-Thomas

JIRA issue [GEOPY-239] was created.

HI @ThomasMGeo, agreed we need to beef up our tutorials.
Most of what you need should be in here: https://geoh5py.readthedocs.io/en/latest/content/user_guide/entities.html#Grid2D
with the added general functions to get_data("name") when pulling out of geoh5.

It has been added to our backlog under GEOPY-239 and we will get to it asap

Perfect! if I come up with anything good, I will share.

Cheers,
Thomas

Attached is a notebook and dataset created from Verde. Would be great to get your 2c!

workflow.zip

Down the road, we want to interface with Verde directly through subsurface objects. But that works for now.
See attached, with some mods:
geoh5py_xarray_testing.zip

image

Do you have a code block to go from workspace to xarray?

I have one from workspace to dataframe?

def object_2_dataframe(entity, fields=[], inplace=False, vertices=True, index=None):
    """
    Convert an object to a pandas dataframe
    """
    if getattr(entity, "vertices", None) is not None:
        locs = entity.vertices
    elif getattr(entity, "centroids", None) is not None:
        locs = entity.centroids

    if index is None:
        index = np.arange(locs.shape[0])

    data_dict = {}
    if vertices:
        data_dict["X"] = locs[index, 0]
        data_dict["Y"] = locs[index, 1]
        data_dict["Z"] = locs[index, 2]

    d_f = pd.DataFrame(data_dict, columns=list(data_dict.keys()))
    for field in fields:
        if entity.get_data(field):
            obj = entity.get_data(field)[0]
            if obj.values.shape[0] == locs.shape[0]:
                d_f[field] = obj.values.copy()[index]
                if inplace:
                    obj.values = None

    return d_f

You lose some of the 'grid' aspects, but not hard to re-shape.

Right, that guy is general for any entity technically (grid, points, curves...) Might want to pass along your shape, true.

GA_github.zip
This is pretty rough, but I got there. Might mess around with the function later. Would be great to build in some better I/O with grids to xarray.

You just forgot to add fields to the object_2_dataframe call:
test = object_2_dataframe(grid, fields=["masked_k"])

You can look inside the function to see how it pulls data values out.

Notebook in recent PR address I/O workflow.