Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

feature: speed-up resample_to_grid #2169

Open
cneyens opened this issue Apr 26, 2024 · 2 comments
Open

feature: speed-up resample_to_grid #2169

cneyens opened this issue Apr 26, 2024 · 2 comments
Milestone

Comments

@cneyens
Copy link

cneyens commented Apr 26, 2024

Is your feature request related to a problem? Please describe.

The resample_to_grid method allows for the resampling of raster data to the MODFLOW grid. Currently supported raster formats are GeoTiff, ASCII Grid (ESRI ASCII), and Erdas Imagine .img, per the notebook describing its functionality. These formats are rectilinear and have a regular x and/or y spacing. The interpolation algorithm for bilinear, bicubic and nearest-neighbour interpolation which is used in the underlying code (scipy.interpolate.griddata) is intended for unstructured data. We found that the scipy.interpolate.interpn algorithm for rectilinear grids outperforms the former, especially for bilinear and nearest-neighbour interpolation, with a speed-up of several orders of magnitude on my machine (albeit from a rudimentary benchmark; see notebook attached below). For bilinear interpolation from a 1x1 m input raster, interpn is about 10 000 times faster.

Describe the solution you'd like

When the method in resample_to_grid is nearest, bilinear or cubic and the raster input format is rectilinear, use scipy.interpolate.interpn instead of scipy.interpolate.griddata.

Describe alternatives you've considered

The current method scipy.interpolate.griddata, which is slower for rectilinear raster data.

Additional context

Notebook with rudimentary benchmarking for relatively fine structured and vertex grids and rectangular input data at different resolutions, on a Windows 10 i9 3.50 GHz machine using flopy v3.6.0 and scipy v1.13.0. This needs verification.

resample_benchmark.zip

@dbrakenhoff
Copy link
Contributor

Sounds like a good idea :)!

I had to modify your code a bit to get the same result between both optimization methods, but after that change they produce nearly the same result:
image

I changed the following, not sure if these are logical changes but at least they produce the same output:
Flipping y in create_raster and changing the dimensions of v:

    y = np.arange(0, Ly + 10, delta)[::-1]
    # ...
    v = np.random.rand(ny, nx)

Switching y, and x in interpn:

ip1 = sp.interpolate.interpn((y, x), v, (yc, xc), method=m)

Weirdly, i get one cell where the difference != 0 but i guess that's some edge case that both methods handle differently...? See bottom left corner:
image

@cneyens
Copy link
Author

cneyens commented Apr 26, 2024

Yes, interpn requires marginal vectors instead of grid points, so your change is necessary to obtain similar (i.e. correct interpolation) results. I only tested for performance since I do not expect exactly the same results from the different algorithms.

I can reproduce the single cell difference in the lowerleft corner for the vertex grid using nearest-neighbour interpolation. I guess it's indeed an edge case depending on the geometry of the voronoi grid. Note that for bilinear interpolation, the results do differ more significantly:

resample_linear_vertex

@wpbonelli wpbonelli added this to the 3.7.0 milestone May 1, 2024
@wpbonelli wpbonelli modified the milestones: 3.7.0, 3.8.0 May 23, 2024
@wpbonelli wpbonelli modified the milestones: 3.8.0, 3.9.0 Aug 5, 2024
@wpbonelli wpbonelli modified the milestones: 3.9, 3.10 Sep 4, 2024
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

No branches or pull requests

3 participants