DHI/mikeio

Performance issue with find_index on Grid2D

JacobGudbjerg opened this issue · 4 comments

Describe the bug
On Grid2D, calling find_index(x=50, y=50) is ~15 times slower than calling find_index(x=50) or find_index(y=50).

To Reproduce
The code below times 5000 calls of each variant:

import timeit
import mikeio

g = mikeio.Grid2D(dx=500, dy=500, bbox=[0, 0, 1000, 1000])
r = range(0, 5000)
times = []

# 5000 lookups by x only
t = timeit.default_timer()
for i in r:
    g.find_index(x=50)
times.append(timeit.default_timer() - t)

# 5000 lookups by y only
t = timeit.default_timer()
for i in r:
    g.find_index(y=50)
times.append(timeit.default_timer() - t)

# 5000 lookups by x and y together
t = timeit.default_timer()
for i in r:
    g.find_index(x=50, y=50)
times.append(timeit.default_timer() - t)

# Elapsed seconds for [x only, y only, x and y]
times
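
The three timed loops above could also be folded into a small helper. This is only a sketch: `time_call` is a hypothetical name, not part of mikeio, and the workload is a stand-in so the snippet runs without mikeio installed.

```python
import timeit


def time_call(fn, number=5000, repeat=3):
    """Return the best total time in seconds for `number` calls of `fn`.

    Hypothetical helper for illustration. Taking the minimum over a few
    repeats reduces noise from other processes.
    """
    return min(timeit.repeat(fn, number=number, repeat=repeat))


# Stand-in workload (not a grid lookup) so the sketch is self-contained
elapsed = time_call(lambda: sum(range(100)), number=1000)
print(f"{elapsed:.4f} s for 1000 calls")
```

With mikeio installed and `g` defined as above, the three cases would be `time_call(lambda: g.find_index(x=50))`, `time_call(lambda: g.find_index(y=50))` and `time_call(lambda: g.find_index(x=50, y=50))`.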

Expected behavior
I would expect finding the indices of x and y together to take about the same time as finding them individually.

Workaround
Find them individually

System information:

  • Python version 3.10
  • MIKE IO version 1.4.0

In order to understand your typical use case:

Do you need to find a single point, a handful or many points in the same grid?

I need to find ~200,000 points in a 500 m grid covering Denmark.

Good to know. That is slightly different from the typical use cases I am used to, but it should still be doable.

In that case, it makes sense to supply coordinates as a single numpy array.

Copy/pasteable code

import numpy as np
import mikeio

g = mikeio.Grid2D(bbox=[0, 0, 1, 5], dx=0.2)
np.random.seed(0)
x = np.random.uniform(0, 1, size=100_000)
y = np.random.uniform(0, 5, size=100_000)
xy = np.column_stack([x, y])
ii, jj = g.find_index(coords=xy)
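
To illustrate why passing all points in one array scales well, here is a NumPy-only sketch of a nearest-node lookup on a regular axis. It is not mikeio's implementation: `nearest_index_regular` is a hypothetical helper, and the node positions (`x0`, `dx`, `nx`, and so on) are assumptions that may differ from how Grid2D places nodes relative to `bbox`.

```python
import numpy as np


def nearest_index_regular(values, origin, spacing, n):
    """Nearest node index on an axis with nodes at origin + k * spacing, k = 0..n-1.

    Hypothetical helper for illustration; not part of mikeio.
    Points whose nearest node would lie outside the axis get index -1.
    """
    values = np.asarray(values, dtype=float)
    idx = np.rint((values - origin) / spacing).astype(int)
    outside = (idx < 0) | (idx >= n)
    return np.where(outside, -1, idx)


# Assumed axis layout, for illustration only
x0, dx, nx = 0.0, 0.2, 6
y0, dy, ny = 0.0, 0.2, 26

rng = np.random.default_rng(0)
xy = np.column_stack([rng.uniform(0, 1, 100_000), rng.uniform(0, 5, 100_000)])

# One vectorized pass per axis, no Python-level loop over points
ii = nearest_index_regular(xy[:, 0], x0, dx, nx)
jj = nearest_index_regular(xy[:, 1], y0, dy, ny)
```

Each axis is handled with a handful of array operations, so the cost per point is small compared with calling a lookup function once per point.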