Fitting Parametric Curves in Python

I have experimental data of the form (X,Y) and a theoretical model of the form (x(t;*params),y(t;*params)) where t is a physical (but unobservable) variable, and *params are the parameters that I want to determine. t is a continuous variable, and there is a 1:1 relationship between x and t and between y and t in the model.

In a perfect world, I would know the value of T (the real-world value of the parameter) and would be able to do an extremely basic least-squares fit to find the values of *params. (Note that I am not trying to "connect" the values of x and y in my plot, like in 31243002 or 31464345.) I cannot guarantee that in my real data, the latent value T is monotonic, as my data is collected across multiple cycles.

I'm not very experienced with curve fitting by hand, and since I haven't found a basic scipy function that handles this case directly, I've been using extremely crude methods. My basic approach involves:

  1. Choose some value of *params and apply it to the model
  2. Take an array of t values and put it into the model to create an array of model(*params) = (x(*params),y(*params))
  3. Interpolate X (the data values) into model to get Y_predicted
  4. Run a least-squares (or other) comparison between Y and Y_predicted
  5. Do it again for a new set of *params
  6. Eventually, choose the best values for *params

There are several obvious problems with this approach.

1) I'm not experienced enough with coding to develop a very good "do it again" other than "try everything in the solution space," or maybe "try everything in a coarse grid" and then "try everything again in a slightly finer grid in the hotspots of the coarse grid." I tried doing MCMC methods, but I never found any optimum values, largely because of problem 2.

2) Steps 2-4 are super inefficient in their own right.

I've tried something like the code below (it resembles pseudo-code; the actual functions are made up). There are many minor quibbles that could be made about using broadcasting on A,B, but those are less significant than the problem of needing to interpolate for every single step.

People I know have recommended using some sort of Expectation Maximization algorithm, but I don't know enough about that to code one up from scratch. I'm really hoping there's some awesome scipy (or otherwise open-source) algorithm I haven't been able to find that covers my whole problem, but at this point I am not hopeful.

import numpy as np
from scipy import interpolate

# Measured data, loaded elsewhere (1-D arrays of equal length)
X_data = ...
Y_data = ...

# Made-up model functions, parametrised by the latent variable t
def x(t, A, B):
    return A**t + B**t

def y(t, A, B):
    return A*t + B

def interp(A, B):
    # Build y as a function of x for this choice of parameters,
    # by sampling the model on a grid of t values
    ts = np.arange(-10, 10, 0.1)
    xs = x(ts, A, B)
    ys = y(ts, A, B)
    f = interpolate.interp1d(xs, ys)
    return f

N = 101
lsqs = np.zeros((N**2, 3))   # each row holds (A, B, sum of squared residuals)

count = 0
for i in range(N):
    A = 0.1*i            # checks A between 0 and 10
    for j in range(N):
        B = 10 + 0.1*j   # checks B between 10 and 20

        f = interp(A, B)
        y_fit = f(X_data)
        squares = np.sum((y_fit - Y_data)**2)

        lsqs[count] = (A, B, squares)  # store the values for comparison later
        count += 1                     # move on to the next cell

i = np.argmin(lsqs[:, 2])

A_optimal = lsqs[i, 0]
B_optimal = lsqs[i, 1]
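
For comparison, here is a minimal sketch of how the grid loop might instead be handed to scipy.optimize.least_squares, continuing from the snippet above (it reuses the same made-up x, y model and the X_data/Y_data placeholders, so it is only illustrative). A local optimizer needs a reasonable starting guess, and fill_value="extrapolate" is used so the interpolant does not raise when a data point falls outside the sampled x range.

from scipy import optimize

def residuals(params):
    # For a given (A, B), build y as a function of x from the model and
    # return the per-point misfit against the measured data
    A, B = params
    ts = np.arange(-10, 10, 0.1)
    f = interpolate.interp1d(x(ts, A, B), y(ts, A, B),
                             bounds_error=False, fill_value="extrapolate")
    return f(X_data) - Y_data

# least_squares minimizes the sum of squared residuals over (A, B),
# starting from a guess inside the same box the grid search covered
result = optimize.least_squares(residuals, x0=[1.0, 15.0],
                                bounds=([0.0, 10.0], [10.0, 20.0]))
A_optimal, B_optimal = result.x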
Abradant answered 21/8, 2015 at 6:27 Comment(2)
You should check out scipy.interpolate.splprep – Stutz
This is unfortunately only useful for connecting the dots on my (X,Y) data, and not for fitting to find *params – Abradant

If I understand the question correctly, the params are constants which are the same in every sample, but t varies from sample to sample. So, for example, maybe you have a whole bunch of points which you believe have been sampled from a circle

x = a+r cos(t)   
y = b+r sin(t)

at different values of t.

In this case, what I would do is eliminate the variable t to get a relation between x and y; here that gives (x-a)^2 + (y-b)^2 = r^2. If your data fit the model perfectly, you would have (x-a)^2 + (y-b)^2 = r^2 at each of your data points. With some error, you could still find (a,b,r) to minimize

sum_i ((x_i-a)^2 + (y_i-b)^2 - r^2)^2.
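
A minimal sketch of that minimization with scipy.optimize.least_squares might look like the following (the synthetic x_data/y_data are stand-ins for illustration; replace them with your measured points):

import numpy as np
from scipy import optimize

# Stand-in data: noisy samples from a circle (replace with your own points)
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 200)
x_data = 2.0 + 1.5 * np.cos(t) + rng.normal(scale=0.05, size=t.size)
y_data = -1.0 + 1.5 * np.sin(t) + rng.normal(scale=0.05, size=t.size)

def residuals(params, x, y):
    a, b, r = params
    # One residual per point; least_squares sums their squares, which is
    # exactly sum_i ((x_i-a)^2 + (y_i-b)^2 - r^2)^2
    return (x - a)**2 + (y - b)**2 - r**2

# Initial guess: centroid of the points and their mean distance from it
a0, b0 = x_data.mean(), y_data.mean()
r0 = np.sqrt(((x_data - a0)**2 + (y_data - b0)**2).mean())

res = optimize.least_squares(residuals, x0=[a0, b0, r0], args=(x_data, y_data))
a_fit, b_fit, r_fit = res.x

Since least_squares squares and sums whatever the residual function returns, returning (x_i-a)^2 + (y_i-b)^2 - r^2 per point reproduces the objective above.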

Mathematica's Eliminate command can automate the procedure of eliminating t in some cases.

PS: You might do better at stats.stackexchange, math.stackexchange, or mathoverflow.net. I know the last one has a scary reputation, but we don't bite, really!

Agneta answered 12/4, 2017 at 1:38 Comment(1)
Thanks for the feedback. I am trying to do exactly what you suggest – find the equivalents of a and b. However, the equations don't always solve analytically for t, and when they do, they end up with a number of singularities. When I keep it in terms of t, I only have singularities at t=0. However, when I eliminate t, I get parts of the equations that look like (some constants taken out for simplicity) y - x + c' = y/x + (y/(x-y))^(1/3). This means that relatively small errors in x can overwhelm the fitting procedure. – Abradant
