QR decomposition different in lm and biglm?

I'm trying to recover the R matrix from the QR decomposition used in biglm. For this I am using a portion of the code in vcov.biglm and put it into a function like so:

qr.R.biglm <- function (object, ...) {
    # Return the qr.R matrix from a biglm object
    object$qr <- .Call("singcheckQR", object$qr)
    p <- length(object$qr$D)
    R <- diag(p)
    R[row(R) > col(R)] <- object$qr$rbar
    R <- t(R)
    R <- sqrt(object$qr$D) * R
    dimnames(R) <- list(object$names, object$names)
    return(R)
}

More specifically, I'm trying to get the same result as using qr.R from the base package, which is used on QR decompositions of class "qr" such as those contained in the lm class (lm$qr). The code for the base function is as follows:

qr.R <- function (qr, complete = FALSE) {
    if (!is.qr(qr)) 
        stop("argument is not a QR decomposition")
    R <- qr$qr
    if (!complete) 
        R <- R[seq.int(min(dim(R))), , drop = FALSE]
    R[row(R) > col(R)] <- 0
    R
}

I manage to get the same result for a sample regression, except for the signs.

x <- as.data.frame(matrix(rnorm(100 * 10), 100, 10))
y <- seq.int(1, 100)
fit.lm <- lm("y ~ .", data =  cbind(y, x))
R.lm <- qr.R(fit.lm$qr)

library(biglm)
fmla <- as.formula(paste("y ~ ", paste(colnames(x), collapse = "+")))
fit.biglm <- biglm(fmla, data = cbind(y, x))
R.biglm <- qr.R.biglm(fit.biglm)

Comparing both, it's clear that the absolute values match, but not the signs.

mean(abs(R.lm) - abs(R.biglm) < 1e-6)
[1] 1
mean(R.lm - R.biglm < 1e-6)
[1] 0.9338843

I can't quite figure out why this is. I would like to be able to get the same result for the R matrix as lm from biglm.

The difference between the two R matrices is that biglm apparently performs its rotations such that R's diagonal elements are all positive, while lm (or, really, the routines it calls) imposes no such constraint. (There should be no numerical advantage to one strategy or the other, so the difference is just one of convention, AFAIKT.)

You can make lm's results identical to biglm's by imposing that additional constraint yourself. I'd use a reflection matrix that multiplies columns by either 1 or -1, such that the diagonal elements all end up positive:

## Apply the necessary reflections
R.lm2 <- diag(sign(diag(R.lm))) %*% R.lm

## Show that they did the job
mean(R.lm2 - R.biglm < 1e-6)
# [1] 1

Recommended topics

Hot tags