Storing a large but low-rank matrix efficiently
Asked Answered
H

2

7

I know there are packages in R to store sparse matrices efficiently. Is there also a way to store a low-rank matrix efficiently? For example:

A <- matrix(rnorm(1e6), nrow=1e5, ncol=1e1)
B <- A %*% t(A)

Now, B is too large to store in memory, but it is low in rank. Is there any way to construct and store B in an efficient way, such that some basic read methods (rowSums, colSums, etc) are performed on the fly, in order to trade for cpu or memory?

Hypogenous answered 14/3, 2013 at 0:46 Comment(6)
Interesting question- what applications would it have? (Where do low rank matrices generally appear?)Negativism
@DavidRobinson: Those matrices are used, for instance, as approximations of large dense matrices (too large to compute, or even to store), in some optimization algorithms.Wines
If you are willing to approximate B, could you use a low-dimensional approximation, e.g. use a SVD and keep the first n dimensions of the SVD? Not sure this is quite what you want, but might be worth considering.Coati
While it doesn't answer your question, the following seems somewhat relevant: mathoverflow.net/questions/92328/low-rank-matrix-factorizationRampage
Yes, I agree with above comment. Factor it, and it will become sparse.Eleanoraeleanore
This might be of help: stanford.edu/class/ee378b/papers/…Gravitation
O
2

Your question is already the answer: To store such a low rank matrix efficiently, you make a data structure containing both factors. If matrix-vector multiplication is required, it can be done from right to left using matrix-vector products of the factors.

One application of this strategy and data structure can be found in implementations of limited-memory Broyden or BFGS quasi-Newton methods.

Oletta answered 13/2, 2014 at 14:29 Comment(0)
T
0

Here's another approach, although I miss the experience to know how efficient this would be for large matrices:

If the rank is low, it means the matrix contains many irrelevant lines, which are linear combinations of others. If the matrix represents a linear system of equations, one could design an algorithm, which successively removes those lines.

To check if a line is irrelevant, check if the rank of the matrix without that line is still the same. For computing the matrix rank, see this and that answer.

Toxic answered 28/3, 2013 at 10:10 Comment(2)
That basically sounds like a very expensive way of factoring :-)Hypogenous
Sorry, but a terribly poor idea that works only for a simple case, with replicated rows. In fact, it is TRIVIAL to generate a matrix of rank 1, that has NO replicate rows or columns. Thus pick random row vectors U and V, then U'*V has rank 1.Gravitation

© 2022 - 2024 — McMap. All rights reserved.