Is there a Python equivalent to the smooth.spline function in R
Asked Answered
O

3

20

The smooth.spline function in R allows a tradeoff between roughness (as defined by the integrated square of the second derivative) and fitting the points (as defined by summing the squares of the residuals). This tradeoff is accomplished by the spar or df parameter. At one extreme you get the least squares line, and the other you get a very wiggly curve which intersects all of the data points (or the mean if you have duplicated x values with different y values)

I have looked at scipy.interpolate.UnivariateSpline and other spline variants in Python, however, they seem to only tradeoff by increasing the number of knots, and setting a threshold (called s) for the allowed SS residuals. By contrast, the smooth.spline in R allows having knots at all the x values, without necessarily having a wiggly curve that hits all the points -- the penalty comes from the second derivative.

Does Python have a spline fitting mechanism that behaves in this way? Allowing all knots but penalizing the second derivative?

Obeded answered 27/3, 2015 at 23:56 Comment(1)
Possible duplicate of Python natural smoothing splinesFaddish
A
4

You can use R functions in Python with rpy2:

import rpy2.robjects as robjects
r_y = robjects.FloatVector(y_train)
r_x = robjects.FloatVector(x_train)

r_smooth_spline = robjects.r['smooth.spline'] #extract R function# run smoothing function
spline1 = r_smooth_spline(x=r_x, y=r_y, spar=0.7)
ySpline=np.array(robjects.r['predict'](spline1,robjects.FloatVector(x_smooth)).rx2('y'))
plt.plot(x_smooth,ySpline)

If you want to directly set lambda: spline1 = r_smooth_spline(x=r_x, y=r_y, lambda=42) doesn't work, because lambda has already another meaning in Python, but there is a solution: How to use the lambda argument of smooth.spline in RPy WITHOUT Python interprating it as lambda.

To get the code running you first need to define the data x_train and y_train and you can define x_smooth=np.array(np.linspace(-3,5,1920)). if you want to plot it between -3 and 5 in Full-HD-resolution.

Note that this code is not fully compatible with Jupyter-notebooks for the latest versions of rpy2. You can fix this by using !pip install -Iv rpy2==3.4.2 as described in NotImplementedError: Conversion 'rpy2py' not defined for objects of type '<class 'rpy2.rinterface.SexpClosure'>' only after I run the code twice

Ashur answered 4/10, 2019 at 15:13 Comment(0)
N
1

I've been looking for exactly the same thing, but would rather not have to translate the code to Python. The Splinter package seems like an option, however: https://github.com/bgrimstad/splinter

Noreennorene answered 21/5, 2016 at 22:10 Comment(0)
I
1

From research on google, I concluded that

By contrast, the smooth.spline in R allows having knots at all the x values, without necessarily having a wiggly curve that hits all the points -- the penalty comes from the second derivative.

Ivar answered 9/7, 2019 at 13:7 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.