Multivariate spline interpolation in python/scipy?
Asked Answered
C

2

31

Is there a library module or other straightforward way to implement multivariate spline interpolation in python?

Specifically, I have a set of scalar data on a regularly-spaced three-dimensional grid which I need to interpolate at a small number of points scattered throughout the domain. For two dimensions, I have been using scipy.interpolate.RectBivariateSpline, and I'm essentially looking for an extension of that to three-dimensional data.

The N-dimensional interpolation routines I have found are not quite good enough: I would prefer splines over LinearNDInterpolator for smoothness, and I have far too many data points (often over one million) for, e.g., a radial basis function to work.

If anyone knows of a python library that can do this, or perhaps one in another language that I could call or port, I'd really appreciate it.

Croce answered 4/6, 2011 at 17:21 Comment(1)
Just to make sure I'm understanding things correctly, your data is already on a regular grid and you want to interpolate at irregular points? (If so, you want scipy.ndimage.map_coordinates, I'll post an example in a bit...)Dyewood
D
47

If I'm understanding your question correctly, your input "observation" data is regularly gridded?

If so, scipy.ndimage.map_coordinates does exactly what you want.

It's a bit hard to understand at first pass, but essentially, you just feed it a sequence of coordinates that you want to interpolate the values of the grid at in pixel/voxel/n-dimensional-index coordinates.

As a 2D example:

import numpy as np
from scipy import ndimage
import matplotlib.pyplot as plt

# Note that the output interpolated coords will be the same dtype as your input
# data.  If we have an array of ints, and we want floating point precision in
# the output interpolated points, we need to cast the array as floats
data = np.arange(40).reshape((8,5)).astype(np.float)

# I'm writing these as row, column pairs for clarity...
coords = np.array([[1.2, 3.5], [6.7, 2.5], [7.9, 3.5], [3.5, 3.5]])
# However, map_coordinates expects the transpose of this
coords = coords.T

# The "mode" kwarg here just controls how the boundaries are treated
# mode='nearest' is _not_ nearest neighbor interpolation, it just uses the
# value of the nearest cell if the point lies outside the grid.  The default is
# to treat the values outside the grid as zero, which can cause some edge
# effects if you're interpolating points near the edge
# The "order" kwarg controls the order of the splines used. The default is 
# cubic splines, order=3
zi = ndimage.map_coordinates(data, coords, order=3, mode='nearest')

row, column = coords
nrows, ncols = data.shape
im = plt.imshow(data, interpolation='nearest', extent=[0, ncols, nrows, 0])
plt.colorbar(im)
plt.scatter(column, row, c=zi, vmin=data.min(), vmax=data.max())
for r, c, z in zip(row, column, zi):
    plt.annotate('%0.3f' % z, (c,r), xytext=(-10,10), textcoords='offset points',
            arrowprops=dict(arrowstyle='->'), ha='right')
plt.show()

enter image description here

To do this in n-dimensions, we just need to pass in the appropriate sized arrays:

import numpy as np
from scipy import ndimage

data = np.arange(3*5*9).reshape((3,5,9)).astype(np.float)
coords = np.array([[1.2, 3.5, 7.8], [0.5, 0.5, 6.8]])
zi = ndimage.map_coordinates(data, coords.T)

As far as scaling and memory usage goes, map_coordinates will create a filtered copy of the array if you're using an order > 1 (i.e. not linear interpolation). If you just want to interpolate at a very small number of points, this is a rather large overhead. It doesn't increase with the number points you want to interpolate at, however. As long as have enough RAM for a single temporary copy of your input data array, you'll be fine.

If you can't store a copy of your data in memory, you can either a) specify prefilter=False and order=1 and use linear interpolation, or b) replace your original data with a filtered version using ndimage.spline_filter, and then call map_coordinates with prefilter=False.

Even if you have enough ram, keeping the filtered dataset around can be a big speedup if you need to call map_coordinates multiple times (e.g. interactive use, etc).

Dyewood answered 4/6, 2011 at 19:17 Comment(1)
Any remote possibility of splines for unstructured 3D data (asked here)?Militant
A
3

Smooth spline interpolation in dim > 2 is difficult to implement, and so there are not many freely available libraries able to do that (in fact, I don't know any).

You can try inverse distance weighted interpolation, see: Inverse Distance Weighted (IDW) Interpolation with Python . This should produce reasonably smooth results, and scale better than RBF to larger data sets.

Amenable answered 4/6, 2011 at 18:13 Comment(10)
IDW is a terrible choice in almost every case. It assumes that all of your input data points are local minimums or maximums! I'm not sure at all people use it as a general interpolation technique... (Lots of people certainly do, though!) Unless you specifically want "bullseye" patters around your observation points, it's not what you want. /rant Regardless, I think the OP has regularly gridded data that they want to get point interpolations at, rather than interpolating irregular points. There are better options in that case.Dyewood
Incidentally, I didn't mean to sound quite so rude... I just have a vendetta against IDW as an interpolation method! :)Dyewood
@Joe Kington, (missed this among your zillions of good answers and comments): What methods would you suggest for interpolating scattered / non-uniform data in 2d, 5d, 10d ? In 2d / 3d one can triangulate, but 10d ?Monitor
@Denis - Thanks :) Radial Basis Functions can be slow, but they're smooth and scale well to high dimensions. As long as you're not optimizing by using only the N nearest points, you don't need to worry about triangulation at all. The only parameter is distance, regardless of dimensionality. Hope that helps, at any rate.Dyewood
@Joe Kington, Hi Joe! Thanks for all the python guidance! I'm confronting the same issue as the OP here, and wondering what the best option amongst: (1) scipy.interpolate.interpn, (2) scipy.interpolate.RegularGridInterpolator, and (3) scipy.ndimage.interpolation.map_coordinates is. Are any of these methods equivalent, possibly identical behind the scenes, or have any particular advantages w.r.t speed, memory, and/or accuracy?Jonell
It seems (1) and (2) use piecewise linear interpolation, and (3) uses splines (order 1 in my case, but that's still distinct from piecewise linear because the splines have to be differentiable at the knots?). Also, it seems (1) returns the interpolated values and (2) returns a function that can be evaluated repeatedly. Sanity check: so, if you need splines use (3). If you are constantly interpolating the same data, but piecewise linear is ok, then you probably want (2). If you're just interpolating the data once, and piecewise linear is ok, then (1) is sufficient. Does that sound right?Jonell
@Jonell - Actually, interpn will use splines, if you specify it. It uses RectBivariateSpline in that case, which is usually considerably slower than map_coordinates. Otherwise, interpn is just a thin wrapper around RegularGridInterpolator. Also, for the functions you're working with, an order=1 spline is exactly equivalent to piecewise linear interpolation. On a side note, for regularly-gridded input, the type of interpolation usually isn't terribly important. It's most critical when interpolating a few "sparse" values.Dyewood
@Jonell - As far as which to use, I tend to use map_coordinates the most. That's simply because it's been around the longest and I know it very well (I think interpn was just added in the last release of scipy, and RegularGridInterpolator hasn't been around much longer). The other functions have a much more sane interface, however. They're certainly easier to understand, especially if you're using them for the first time. I could be wrong, but RegularGridInterpolator should be more efficient than map_coordinates for small numbers of interpolated points.Dyewood
@Joe Kington, Thanks again Joe. The "functions" I'm interpolating actually happen to be 3D MRI images of the human brain. Order 1 splines and piecewise linear interpolation may still be the same there, I'll just have to read more about the theory of splines to prove that to myself. I actually found the map_coordinates interface to be reasonably intuitive (with the help from your explanation). I'm interpolating a number of points equal to the number of voxels in my original image (something like 200**3), so map_coordinates may be better in the long run.Jonell
For the record, it looks like interpn and RegularGridInterpolator does not use splines for N>2 (and the latter not at all). For spline interpolation in dimension greater than 2, it seems like map_coordinate is the only option.Febri

© 2022 - 2024 — McMap. All rights reserved.