From what I can tell there's a lot of legacy baggage here, and we should not be using numpy.polyfit
, and we should prefer numpy.polynomial.polynomial.Polynomial.fit
.
Consider comments on this github issue from 2016:
While the documentation is reasonably clear in noting that coefficients are returned [from numpy.polyfit
—ed.] with highest-order last, this is fairly easy to miss, and is inconsistent with, e.g. numpy.polynomial.polynomial.polyfit()
.
And a bit later
Having the zero-degree coefficient first, as done in numpy.polynomial.polynomial.polyfit
is definitely more logical. I was under the impression that the only reason numpy.polyfit
deviates from this is historical accident, which of course is nearly impossible to correct now since many programmes may depend on this behaviour. Maybe the easiest solution would be to point people to the "preferred" solution in numpy.polyfit
?
From an earlier comment it's evident that the "historical accident" is the behaviour of MATLAB's polyfit
that takes high orders first. Early numpy kept this confusing convention (which it may have even inherited from a predecessor of the project), but later numpy.polynomial.polynomial.polyfit
was implemented to Do It Right™. The crucial difference is that (unlike MATLAB) python uses 0-based indices, in which case it's perfectly natural to have zeroth order first. With this convention there's the beautiful property that item k
corresponds to the term x**k
.
Then there's a newer account in another issue from this year that tries to give a more coherent picture. Quoting the historical recollection from the issue:
History
(not necessarily in chronological order)
- A certain JVM-based linear algebra package had a function,
polyfit
, for fitting polynomials which made some weird design choices,
like returning the coefficients highest-degree first.
- numpy, in an attempt to support fugitives from said environment, created the function
numpy.polyfit
which aped that design choice
- numpy implemented
numpy.ma.polyfit
for masked arrays, using numpy.polyfit
- In an attempt to fix the mistakes of history, numpy created the function
numpy.polynomial.polynomial.polyfit
with almost exactly the
same signature, but with a more sensible coefficient ordering, and
quietly preferred that people use that instead
- People were confused by these two very similar functions (#7478); also the new function could not return a covariance matrix and it did
not have a masked array counterpart
- Powering on towards both API nirvana and worn-out keyboards, numpy introduced the
numpy.polynomial.polynomial.Polynomial
class, and
documented in numpy.polyfit
that that was the preferred way of fitting
polynomials, although that also had no masked implementation and also
did not return a covariance matrix
The responses from devs on the two issues make it clear that numpy.polyfit
is technical debt, and as its documentation says new code should use the Polynomial
class. The documentations have improved a lot since 2016, in that now there are pointers from numpy.polyfit
to Polynomial
, but there's still a lot of ambiguity. Ideally both polyfit
methods should explain their situation with respect to the other, and point users to the Polynomial
class as the one obvious way to write new code.