I have relied on the qr()
function a lot in dealing with rank-deficient situations, but have recently run into some examples where it doesn't work correctly. Consider the
matrix badX
below:
badX <-
structure(c(-1.641906809157e-10, 0, 0, 0, 0, -0.5, 0, 0, -1.10482935525559e-16,
0, -3.06266685765538e-17, 0, -4.83736007092039e-17, 0, -3.14414492582296e-18,
-3.06158275836099e-18), dim = c(4L, 4L), dimnames = list(c("(Intercept)",
"A2", "A3", "B2"), NULL))
We cannot invert this matrix using solve():
solve(badX)
## Error in solve.default(badX): system is computationally singular: reciprocal condition number = 5.55308e-18
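As a check, base R's rcond() returns essentially the same reciprocal condition estimate that appears in the error message (this is just a diagnostic, not a fix):
rcond(badX)   # expect something on the order of 1e-18, per the message above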
Yet qr()
and its associated routines think this matrix has rank 4 and can invert it:
qr(badX)$rank
## [1] 4
qr.solve(badX)
## [,1] [,2] [,3] [,4]
## [1,] -6090479645 0 2.197085e+10 7.366741e+10
## [2,] 0 -2 0.000000e+00 0.000000e+00
## [3,] 0 0 -3.265128e+16 3.353179e+16
## [4,] 0 0 0.000000e+00 -3.266284e+17
This is a pretty ugly result. I have tried varying the tol
argument, with no change in results.
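For instance, loosening tol by several orders of magnitude still reports full rank, even though the diagonal of R spans something like 17 orders of magnitude (an illustration of the sort of thing I tried):
qr(badX, tol = 1e-4)$rank    # still 4, per the above
abs(diag(qr.R(qr(badX))))    # entries range from 0.5 down to ~1e-18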
For context, the origin of this result is this contrast matrix:
badL <-
structure(c(0, 0, 0, 0, 0, -9.89189274870351e-11, 0, -5.55111512312578e-17,
-2.77555756156289e-17, 1.11022302462516e-16, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -0.25, 0, 0, 0, 0, -0.25, 0, 0,
0, 9.89189274870351e-11, 0, 5.55111512312578e-17, 2.77555756156289e-17,
-1.11022302462516e-16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, -4.23939184015843e-11, 0, -4.16333634234434e-17, -1.38777878078145e-17,
5.55111512312578e-17, 0, 0, 0, 0, 0, -4.23939184015843e-11, 0,
-4.16333634234434e-17, -1.38777878078145e-17, 5.55111512312578e-17,
0, 0, 0, 0, 0, 0, 0.25, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.25, 0, 0,
0, 0, 0, 0, 0, 0, 4.23939184015843e-11, 0, 4.16333634234434e-17,
1.38777878078145e-17, -5.55111512312578e-17, 0, 0, 0, 0, 0, -1.41313127284714e-11,
0, -6.93889390390723e-18, -6.93889390390723e-18, 1.38777878078145e-17,
4.23939184015843e-11, 0, 4.16333634234434e-17, 1.38777878078145e-17,
-5.55111512312578e-17, 0, 0, 0, 0, 0), dim = c(5L, 24L), dimnames = list(
NULL, c("(Intercept)", "A2", "A3", "B2", "B3", "C2", "C3",
"A2:B2", "A3:B2", "A2:B3", "A3:B3", "A2:C2", "A3:C2", "A2:C3",
"A3:C3", "B2:C2", "B3:C2", "B2:C3", "B3:C3", "A2:B2:C2",
"A3:B2:C2", "A3:B3:C2", "A2:B2:C3", "A3:B2:C3")))
... from which I obtained the QR decomposition of its transpose, to find that it is supposedly of rank 4:
badQR <- qr(t(badL))
badQR$rank
## [1] 4
The above matrix badX
is equal to qr.R(badQR)[1:4, 1:4]
which, based on the rank calculation, was supposed to be a full-rank upper-triangular matrix.
My remedy seems to be to use zapsmall()
so that I get the rank right...
qr(zapsmall(t(badL)))$rank
## [1] 1
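An SVD-based numerical rank agrees (a sketch; the relative tolerance here is my own choice, not a setting from any package):
# Count singular values that are non-negligible relative to the largest one
rank_svd <- function(m, tol = sqrt(.Machine$double.eps)) {
  d <- svd(m, nu = 0, nv = 0)$d
  sum(d > tol * d[1])
}
rank_svd(t(badL))   # 1, matching the zapsmall() result above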
My question is: why does this happen? If you look at badL
, it's pretty clear that, up to tiny rounding amounts, four of its rows are zero and only the second row is materially nonzero. I would have thought that qr()
's pivoting methods would handle this better. Is there a better way to get more reliable results?
I am running Windows 11 Pro, version 10.0.22000 build 22000. Here's my R system information.
R.Version()
## $platform
## [1] "x86_64-w64-mingw32"
##
## $arch
## [1] "x86_64"
##
## $os
## [1] "mingw32"
##
## $crt
## [1] "ucrt"
##
## $system
## [1] "x86_64, mingw32"
##
## $status
## [1] ""
##
## $major
## [1] "4"
##
## $minor
## [1] "2.0"
##
## $year
## [1] "2022"
##
## $month
## [1] "04"
##
## $day
## [1] "22"
##
## $`svn rev`
## [1] "82229"
##
## $language
## [1] "R"
##
## $version.string
## [1] "R version 4.2.0 (2022-04-22 ucrt)"
##
## $nickname
## [1] "Vigorous Calisthenics"
Created on 2022-06-21 by the reprex package (v2.0.1)
More context
This question came up because I was trying to produce results like this (for a simpler example) in the emmeans package:
> (jt = joint_tests(warpx.emm))
model term df1 df2 F.ratio p.value note
tension 1 37 5.741 0.0217 e
wool:tension 1 37 5.867 0.0204 e
(confounded) 2 37 7.008 0.0026 d e
d: df1 reduced due to linear dependence
e: df1 reduced due to non-estimability
... and in particular, the (confounded)
part. This example is with a two-factor model with wool
at 2 levels and tension
at 3 levels; however, one of the factor combinations is omitted in the data, which means that we can estimate only 1 d.f. for each of the tension
main effect and the wool:tension
interaction effect, and no main effect at all for wool
. There being 4 d.f. for all possible contrasts of the 5 nonempty cells, there are 2 d.f. left over, and those are in the (confounded)
part.
The computation is based on the associated estimable functions:
> attr(jt, "est.fcns")
$tension
(Intercept) woolB tensionM tensionH woolB:tensionM woolB:tensionH
[1,] 0 0 1 0 0.5 0
$`wool:tension`
(Intercept) woolB tensionM tensionH woolB:tensionM woolB:tensionH
[1,] 0 0 0 0 1 0
$`(confounded)`
(Intercept) woolB tensionM tensionH woolB:tensionM woolB:tensionH
[1,] 0 -1 0 0 0 0
[2,] 0 1 0 0 0 0
[3,] 0 -1 0 0 0 0
[4,] 0 -1 0 1 0 0
... and on the contrasts among all cells in the design:
> contrast(warpx.emm, "consec")@linfct
(Intercept) woolB tensionM tensionH woolB:tensionM woolB:tensionH
[1,] 0 1 0 0 0 0
[2,] 0 -1 1 0 0 0
[3,] 0 1 0 0 1 0
[4,] 0 -1 -1 1 -1 0
[5,] 0 1 0 0 0 1
The method I use is to combine the estimable functions for tension
and wool:tension
, and obtain the QR decomposition of its transpose. Then I use qr.resid()
with that and the transpose of the above cell contrasts. That leaves us (after transposing yet again) with the estimable functions shown for (confounded)
. That matrix has 4 rows, but its rank is only 2, as determined by the QR decomposition of this result; then I extract the 2x2 portion of the R part to complete the computation of the F statistic.
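Here is a self-contained sketch of that computation, with the matrices re-typed from the warpx output above. Because the displayed (confounded) matrix has four rows, I drop the fifth cell contrast (the one involving the non-estimable woolB:tensionH coefficient) before projecting; that reading, and all object names, are mine rather than actual emmeans code:
est <- rbind(tension        = c(0, 0, 1, 0, 0.5, 0),
             `wool:tension` = c(0, 0, 0, 0, 1,   0))
cells <- rbind(c(0,  1,  0, 0,  0, 0),
               c(0, -1,  1, 0,  0, 0),
               c(0,  1,  0, 0,  1, 0),
               c(0, -1, -1, 1, -1, 0))
qr_est <- qr(t(est))                         # QR of the transposed estimable functions
confounded <- t(qr.resid(qr_est, t(cells)))  # part of 'cells' not spanned by 'est'
qr(t(confounded))$rank                       # 2, as stated above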
The example at the beginning of this question is similar, but with a larger, more complex model; the badL
matrix is the result of the qr.resid()
procedure described above. In this context, some of those rows arguably should be zero. My workaround at present is to examine the diagonal elements of R (the matrix badX
above) and select those that exceed an absolute threshold.
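A sketch of that workaround (the threshold value here is my own choice):
Rdiag <- abs(diag(qr.R(badQR)))
sum(Rdiag > 1e-6)   # number of non-negligible pivots; should be 1 here, matching zapsmall()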
The essential idea here is that I need to decompose that matrix of all contrasts into two parts -- the known estimable functions and the leftovers. An interesting aspect is that the rank of the leftover part is known in advance, a fact that I have not yet taken advantage of. In future development, it may well be better, per @duffymo's suggestion, to use an SVD rather than these gyrations with qr.resid()
. There's always new stuff to learn...
The point about solve()
is well taken, and thresholding does seem to be the most useful suggestion. But a couple of quick questions: 1. How do we know that sqrt(.Machine$double.eps)
is an appropriate threshold? 2. Supposing it is, isn't that evidence that those columns of X
are just noise? (I'm not really preferring solve()
; I'm preferring to toss out columns that look like just noise.) – Gallfly