I spent over a week trying to do essentially the same thing. I tried a whole bunch of optimization stuff to fine tune the coefficients with basically no success, then I found out that there is a closed form solution and it works really well.
Disclaimer: I was trying to fit data with a fixed maximum order of magnitude. If there is no limit to your E1, E2, etc values, then this won't work for you.
Now that I've taken the time to learn this stuff, I actually see that some of the answers would have given good hints if I understood them. It had also been a while since my last statistics and linear algebra class.
So if there are other people out there who are lacking the linear algebra knowledge, here's what I did.
Even though this is not a linear function you are trying to fit, it turns out that this is still a linear regression problem. Wikipedia has a really good article on linear regression. I recommend reading it slowly: https://en.wikipedia.org/wiki/Linear_regression#:~:text=In%20statistics%2C%20linear%20regression%20is,as%20dependent%20and%20independent%20variables). It also links a lot of other good related articles.
If you don't know how to do a simple (single variable) linear regression problem using matrices, take some time to learn how to do that.
Once you learn how to do simple linear regression, then try some multivariable linear regression. Basically, to do multi variable linear regression, you create an X matrix where there is a row for each of your input data items and each row contains all of the variable values for that data entry (plus a 1 in the last column which is used for the constant value at the end of your polynomial (called an intercept)). Then you create a Y matrix that is a single column with a row for each data item. Then you solve B = (XTX)-1XTY. B then becomes all of the coefficients for your polynomial.
For multi-variable polynomial regression, its the same idea, just now you have a huge multi-variable linear regression where each regressor (variable you're doing regression on) is a coefficient for your giant polynomial expression.
So if your input data looks like this:
Inputs |
Output |
a1, b1, |
y1 |
a2, b2, |
y2 |
... |
... |
aN, bN, |
yN |
And you want to fit a 2nd order polynomial of the form y = c1a^2b^2 + c2a^2b + c3a^2 + c4ab^2 + c5ab + c6a + c7b^2 + c8b + c9, then your X matrix will look like:
|
|
|
|
|
|
|
|
|
a1^2*b1^2 |
a1^2*b1 |
a1^2 |
a1*b1^2 |
a1*b1 |
a1 |
b1^2 |
b1 |
1 |
a2^2*b2^2 |
a2^2*b2 |
a2^2 |
a2*b1^2 |
a2*b2 |
a2 |
b2^2 |
b2 |
1 |
... |
... |
... |
... |
... |
... |
... |
... |
... |
aN^2*bN^2 |
aN^2*bN |
aN^2 |
aN*bN^2 |
aN*bN |
aN |
bN^2 |
bN |
1 |
Your Y matrix is simply:
Then you do B = (XTX)-1XTY and then B will equal
|
c1 |
c2 |
c3 |
c4 |
c5 |
c6 |
c7 |
c8 |
c9 |
Note that the total number of coefficients will be (o + 1)V where o is the order of the polynomial and V is the number of variables, so it grows pretty quickly.
If you are using good matrix code, then I believe the runtime complexity will be O(((o+1)V)3 + ((o + 1)V)2N), where V is the number of variables, o is the order of the polynomial, and N is the number of data inputs you have. Initially this sounds pretty terrible, but in most cases, o and V are probably not going to be high, so then this just becomes linear with respect to N.
Note that if you are writing your own matrix code, then it is important to make sure that your inversion code uses something like an LU decomposition. If you use a naïve inversion approach (like I did at first) then that ((o+1)V)3 becomes ((o+1)V)!, which was way worse. Before I made that change, I predict that my 5th order 3 variable polynomial would take roughly 400 google millennia to complete. After using LU decomposition, it takes about 7 seconds.
Another disclaimer
This approach requires that (XTX) not be a singular matrix (in other words, it can be inverted). My linear algebra is a little rough so I don't know all of the cases where that would occur, but I know that it occurs when there is perfect multi-collinearity between input variables. That means one variable is just another factor multiplied by a constant (for example, one input is number of hours to complete a project and another is dollars to complete a project, but the dollars are just based on an hourly rate times the number of hours).
The good news is that when there is perfect multi-collinearity, you'll know. You'll end up with a divide by zero or something when you are inverting the matrix.
The bigger problem is when you have imperfect multi-collinearity. That's when you have two closely related but not perfectly related variables (such as temperature and altitude, or speed and mach number). In those cases, this approach still works in theory, but it becomes so sensitive that small floating point errors can cause the result to be WAY off.
In my observations, however, the result is either really good or really bad, so you could just set some threshold on your mean squared error and if its over that, then say "couldn't compute a polynomial".
keras+tensorflow
to approximate in batches & to use Statistical Inference, notscipy.optimize
– Variegation