What type of orthogonal polynomials does R use?
Asked Answered
C

1

25

I was trying to match the orthogonal polynomials in the following code in R:

X <- cbind(1, poly(x = x, degree = 9))

but in python.

To do this I implemented my own method for giving orthogonal polynomials:

def get_hermite_poly(x,degree):
    #scipy.special.hermite()
    N, = x.shape
    ##
    X = np.zeros( (N,degree+1) )
    for n in range(N):
        for deg in range(degree+1):
            X[n,deg] = hermite( n=deg, z=float(x[deg]) )
    return X

though it does not seem to match it. Does someone know type of orthogonal polynomial it uses? I tried search in the documentation but didn't say.


To give some context I am trying to implement the following R code in python (https://stats.stackexchange.com/questions/313265/issue-with-convergence-with-sgd-with-function-approximation-using-polynomial-lin/315185#comment602020_315185):

set.seed(1234)

N <- 10
x <- seq(from = 0, to = 1, length = N)
mu <- sin(2 * pi * x * 4)
y <- mu
plot(x,y)

X <- cbind(1, poly(x = x, degree = 9))
# X <- sapply(0:9, function(i) x^i)
w <- rnorm(10)

learning_rate <- function(t) .1 / t^(.6)

n_samp <- 2
for(t in 1:100000) {
  mu_hat <- X %*% w
  idx <- sample(1:N, n_samp)
  X_batch <- X[idx,]
  y_batch <- y[idx]
  score_vec <- t(X_batch) %*% (y_batch - X_batch %*% w)

  change <- score_vec * learning_rate(t)
  w <- w + change
}

plot(mu_hat, ylim = c(-1, 1))
lines(mu)
fit_exact <- predict(lm(y ~ X - 1))
lines(fit_exact, col = 'red')
abs(w - coef(lm(y ~ X - 1)))

because it seems to be the only one that works with gradient descent with linear regression with polynomial features.

I feel that any orthogonal polynomial (or at least orthonormal) should work and give a hessian with condition number 1 but I can't seem to make it work in python. Related question: How does one use Hermite polynomials with Stochastic Gradient Descent (SGD)?

Copper answered 4/12, 2017 at 7:4 Comment(0)
C
17

poly uses QR factorization, as described in some detail in this answer.

I think that what you really seem to be looking for is how to replicate the output of R's poly using python.

Here I have written a function to do that based on R's implementation. I have also added some comments so that you can see the what the equivalent statements in R look like:

import numpy as np

def poly(x, degree):
    xbar = np.mean(x)
    x = x - xbar

    # R: outer(x, 0L:degree, "^")
    X = x[:, None] ** np.arange(0, degree+1)

    #R: qr(X)$qr
    q, r = np.linalg.qr(X)

    #R: r * (row(r) == col(r))
    z = np.diag((np.diagonal(r)))  

    # R: Z = qr.qy(QR, z)
    Zq, Zr = np.linalg.qr(q)
    Z = np.matmul(Zq, z)

    # R: colSums(Z^2)
    norm1 = (Z**2).sum(0)

    #R: (colSums(x * Z^2)/norm2 + xbar)[1L:degree]
    alpha = ((x[:, None] * (Z**2)).sum(0) / norm1 +xbar)[0:degree]

    # R: c(1, norm2)
    norm2 = np.append(1, norm1)

    # R: Z/rep(sqrt(norm1), each = length(x))
    Z = Z / np.reshape(np.repeat(norm1**(1/2.0), repeats = x.size), (-1, x.size), order='F')

    #R: Z[, -1]
    Z = np.delete(Z, 0, axis=1)
    return [Z, alpha, norm2];

Checking that this works:

x = np.arange(10) + 1
degree = 9
poly(x, degree)

The first row of the returned matrix is

[-0.49543369,  0.52223297, -0.45342519,  0.33658092, -0.21483446,
  0.11677484, -0.05269379,  0.01869894, -0.00453516],

compared to the same operation in R

poly(1:10, 9)
# [1] -0.495433694  0.522232968 -0.453425193  0.336580916 -0.214834462
# [6]  0.116774842 -0.052693786  0.018698940 -0.004535159
Campanology answered 6/12, 2017 at 23:59 Comment(6)
oh wow thats a long post you linked. Just as a quick question, I thought that using factorizations like qr would artificially decrease the degree of my polynomial. I remember doing a factorization when I had more degree's than data and it decreased my data matrix X to the number of data points. Does your method do that too? I really don't want it do that and its ok I know I have too many degrees, I'm doing that on purpose but even if I have too many degrees/parameters I want the polynomials to still be orthogonal (without changing my number of monomials is crucial).Copper
You can't have more degrees than data. Not sure I really understand how your use case would need that. But probably that is an issue better suited to discussing on stats.stackexchange.com than here.Campanology
what do you mean it can't? Sure you can. Just choose higher degree polynomial. I am not asking you to asses the statistical soundness of it. I am asking if the method you suggested affects the number of monomials I am using. Thanks for ur time btw! :)Copper
I'm not suggesting which particular method you should use. Your question was "what method does poly in R use" and "how can I get the same thing in Python"? Which I tried to show. Can R's algorithm have more degrees than data? Try poly(1:10, degree=11) and see for yourself. For more detailed advice on the related (but different) question of what method best suits your particular application you'll have more luck on cross validated. This site is better for the programming side of the problem, but for fundamental issues about the maths you'll get the attention of more expertise there.Campanology
shouldn't it be np.linalg.qr? without that it didn't work on my sideCopper
Both numpy.linalg.qr and scipy.linalg.qr interface to the same LAPACK routines, so I'm not sure why there would be a difference. But if the numpy function is working better for you, I'll update the answer to use this instead.Campanology

© 2022 - 2024 — McMap. All rights reserved.