Python 4D linear interpolation on a rectangular grid
Asked Answered
R

4

20

I need to interpolate temperature data linearly in 4 dimensions (latitude, longitude, altitude and time).
The number of points is fairly high (360x720x50x8) and I need a fast method of computing the temperature at any point in space and time within the data bounds.

I have tried using scipy.interpolate.LinearNDInterpolator but using Qhull for triangulation is inefficient on a rectangular grid and takes hours to complete.

By reading this SciPy ticket, the solution seemed to be implementing a new nd interpolator using the standard interp1d to calculate a higher number of data points, and then use a "nearest neighbor" approach with the new dataset.

This, however, takes a long time again (minutes).

Is there a quick way of interpolating data on a rectangular grid in 4 dimensions without it taking minutes to accomplish?

I thought of using interp1d 4 times without calculating a higher density of points, but leaving it for the user to call with the coordinates, but I can't get my head around how to do this.

Otherwise would writing my own 4D interpolator specific to my needs be an option here?

Here's the code I've been using to test this:

Using scipy.interpolate.LinearNDInterpolator:

import numpy as np
from scipy.interpolate import LinearNDInterpolator

lats = np.arange(-90,90.5,0.5)
lons = np.arange(-180,180,0.5)
alts = np.arange(1,1000,21.717)
time = np.arange(8)
data = np.random.rand(len(lats)*len(lons)*len(alts)*len(time)).reshape((len(lats),len(lons),len(alts),len(time)))

coords = np.zeros((len(lats),len(lons),len(alts),len(time),4))
coords[...,0] = lats.reshape((len(lats),1,1,1))
coords[...,1] = lons.reshape((1,len(lons),1,1))
coords[...,2] = alts.reshape((1,1,len(alts),1))
coords[...,3] = time.reshape((1,1,1,len(time)))
coords = coords.reshape((data.size,4))

interpolatedData = LinearNDInterpolator(coords,data)

Using scipy.interpolate.interp1d:

import numpy as np
from scipy.interpolate import LinearNDInterpolator

lats = np.arange(-90,90.5,0.5)
lons = np.arange(-180,180,0.5)
alts = np.arange(1,1000,21.717)
time = np.arange(8)
data = np.random.rand(len(lats)*len(lons)*len(alts)*len(time)).reshape((len(lats),len(lons),len(alts),len(time)))

interpolatedData = np.array([None, None, None, None])
interpolatedData[0] = interp1d(lats,data,axis=0)
interpolatedData[1] = interp1d(lons,data,axis=1)
interpolatedData[2] = interp1d(alts,data,axis=2)
interpolatedData[3] = interp1d(time,data,axis=3)

Thank you very much for your help!

Rameriz answered 2/1, 2013 at 9:51 Comment(9)
I have no idea about python, but you should look into quadrilinear or quadricubic interpolation.Mimamsa
Do you have any ideas in any other language about this? ThanksRameriz
Implementing a quadrilinear interpolation for a single point should be fairly easy in any language.Mimamsa
no they aren't. 3 dimensions are constant (.5, .5, 1), but the altitude dimension is not always sampled at the same point hence the grid is not regular on that axis.Rameriz
Not exactly. I have temperature data on a regular grid of lats, lons, PRESSURES and time. However, I my goal is to have a function I can call this way: getTemperature(lat,lon,ALT,time). What I do have is the ALTITUDE data on the same grid as the temperature, so the same regular grid of lats,lons,pressures,time. Right now, my implementation of the method looks up the pressure indices of the two closest altitude values for the given lat,lon,time. Then it uses those altitude values and the two pressure indices to build up the 4D hypertetrahedron in the temperature matrix and interpolate...Rameriz
Is that fast enough, which nd interpolator are you using ?Lei
I am using scipy's interpolate.LinearNDInterpolator. It currently IS fast, but I'm afraid not enough.. Each call to the function takes approx. 1.86ms, but I need it to be run approximately 50,000 times in 5 seconds, i.e. max run time 0.1ms. I've been optimizing the rest of the code to reduce the number of requests, but improving this as well would definitely help. Any suggestions?Rameriz
Well, if LinearND is ~ 20 times too slow, it'll have to go -- difficult to change anything after you've put time into it. It's slow because 1) outer loop in python, 2) it doesn't know that your lat long time are uniform. Now map_coordinates is really fast on uniform grids, < 1 usec see below, so how can you convert your problem to a uniform grid ? Can you preprocess the data, or does it change between queries ? (Time for a new question ?)Lei
Is there any method that also extrapolates?Heart
K
13

In the same ticket you have linked, there is an example implementation of what they call tensor product interpolation, showing the proper way to nest recursive calls to interp1d. This is equivalent to quadrilinear interpolation if you choose the default kind='linear' parameter for your interp1d's.

While this may be good enough, this is not linear interpolation, and there will be higher order terms in the interpolation function, as this image from the wikipedia entry on bilinear interpolation shows:

enter image description here

This may very well be good enough for what you are after, but there are applications where a triangulated, really piecewise linear, interpoaltion is preferred. If you really need this, there is an easy way of working around the slowness of qhull.

Once LinearNDInterpolator has been setup, there are two steps to coming up with an interpolated value for a given point:

  1. figure out inside which triangle (4D hypertetrahedron in your case) the point is, and
  2. interpolate using the barycentric coordinates of the point relative to the vertices as weights.

You probably do not want to mess with barycentric coordinates, so better leave that to LinearNDInterpolator. But you do know some things about the triangulation. Mostly that, because you have a regular grid, within each hypercube the triangulation is going to be the same. So to interpolate a single value, you could first determine in which subcube your point is, build a LinearNDInterpolator with the 16 vertices of that cube, and use it to interpolate your value:

from itertools import product

def interpolator(coords, data, point) :
    dims = len(point)
    indices = []
    sub_coords = []
    for j in xrange(dims) :
        idx = np.digitize([point[j]], coords[j])[0]
        indices += [[idx - 1, idx]]
        sub_coords += [coords[j][indices[-1]]]
    indices = np.array([j for j in product(*indices)])
    sub_coords = np.array([j for j in product(*sub_coords)])
    sub_data = data[list(np.swapaxes(indices, 0, 1))]
    li = LinearNDInterpolator(sub_coords, sub_data)
    return li([point])[0]

>>> point = np.array([12.3,-4.2, 500.5, 2.5])
>>> interpolator((lats, lons, alts, time), data, point)
0.386082399091

This cannot work on vectorized data, since that would require storing a LinearNDInterpolator for every possible subcube, and even though it probably would be faster than triangulating the whole thing, it would still be very slow.

Kwapong answered 2/1, 2013 at 12:58 Comment(7)
Can this be used if the grid changes over time? The altitude points vary as time changes… Thanks again for your help!Rameriz
A quibble on notation: barycentric is a weighted average of 3 points in 2d, 4 in 3d, n+1 in nd; interpolating 4 8 ... 2^n corners of a cube or box is called bilinear, trilinear ... multilinear.Lei
@Denis Your first statement is true, the second is not. The LinearNDInterpolator subdivides the (hyper)cube into disjoint (hyper)tetrahedra tiling the (hyper)cube, and yes, when interpolating for a given point it will only have non-zero weights for the d + 1 vertices of the (hyper)tetrahedron bounding that point, but needs all 2**d vertices to interpolate any point within the cube in a piecewise linear fashion. And there are more ways of interpolating using the 2**d vertices other than multilinear, see this for examples: hpl.hp.com/techreports/98/HPL-98-95.pdfKwapong
You don't agree that "interpolating 4 8 ... 2^n corners of a cube or box is called bilinear, trilinear ? Sure there other ways of weighting the 2^d corners, but all 2^d is simple, and is what interpn( mode="linear") and map_coordinates do.Lei
@Denis I agree that bilinear, trilinear... interpolate using the 4, 8... vertices of a square, cube... But the converse is not true: you can interpolate using the 4, 8... vertices of a square, cube... and not have bilinear, trilinear... interpolation. Bilinear, trilinear is arguably the simplest, most common, way of using all 4, 8... vertices to interpolate every value within the cube, but not the only one.Kwapong
I implemented the tensor product interpolation functionality for arbitrary dimensions in the regulargrid package (Source here).Roughcast
As of SciPy 0.14, interpolation for gridded data is now implemented in interpn() and RegularGridInterpolator. Judging from the source code, both appear to be synonymous.Pyriphlegethon
L
5

scipy.ndimage.map_coordinates is a nice fast interpolator for uniform grids (all boxes the same size). See multivariate-spline-interpolation-in-python-scipy on SO for a clear description.

For non-uniform rectangular grids, a simple wrapper Intergrid maps / scales non-uniform to uniform grids, then does map_coordinates. On a 4d test case like yours it takes about 1 μsec per query:

Intergrid: 1000000 points in a (361, 720, 47, 8) grid took 652 msec
Lei answered 21/1, 2013 at 15:46 Comment(5)
This seems like the obvious answer to me. Am I missing something about the currently accepted answer that map_coordinates can't do?Zabrze
@aaren, map_coordinates does uniform grids only; for non-uniform grids (the OP's question), Jaime suggests one way, intergrid another. Is that your question ?Lei
Sorry, mixed up, I'm reading too many questions at once! Yes, the non-uniformity is the important thing here, which map_coordinates can't deal with as-is.Zabrze
I did some tests on a 3D array (size 60*120*100) on a non-uniform grid, and your method is about 3 times faster than scipy.interpolate.interpn and scipy.interpolate.RegularGridInterpolator.Molten
@Jason, glad it works, spread the word. It's now on PyPi, (February 2020) so pip install [--user] intergrid should work.Lei
C
2

For very similar things I use Scientific.Functions.Interpolation.InterpolatingFunction.

    import numpy as np
    from Scientific.Functions.Interpolation import InterpolatingFunction

    lats = np.arange(-90,90.5,0.5)
    lons = np.arange(-180,180,0.5)
    alts = np.arange(1,1000,21.717)
    time = np.arange(8)
    data = np.random.rand(len(lats)*len(lons)*len(alts)*len(time)).reshape((len(lats),len(lons),len(alts),len(time)))

    axes = (lats, lons, alts, time)
    f = InterpolatingFunction(axes, data)

You can now leave it to the user to call the InterpolatingFunction with coordinates:

>>> f(0,0,10,3)
0.7085675631375401

InterpolatingFunction has nice additional features, such as integration and slicing.

However, I do not know for sure whether the interpolation is linear. You would have to look in the module source to find out.

Commentate answered 19/7, 2013 at 15:7 Comment(0)
S
0

I can not open this address, and find enough informations about this package

Skye answered 11/5, 2020 at 14:49 Comment(3)
Could you elaborate, information for what package are you looking for?Brnaba
Scientific.Functions.Interpolation.InterpolatingFunction, Daan recommand this. I download this package from pypi, but there is nothing useful code in package. Please check pypi.org/project/scientificSkye
The message is a bit old (7 years ago), so the link is no more valid, but looks like sources are on github, and same class can be found here - github.com/khinsen/ScientificPython/blob/master/Scientific/…Brnaba

© 2022 - 2024 — McMap. All rights reserved.