Memory Efficient Centered Sparse SVD/PCA (in Julia)?

I have a 3 million x 9 million sparse matrix with several billion non-zero entries. R and Python do not allow sparse matrices with more than MAXINT non-zero entries, which is why I ended up using Julia.

While scaling this data by the standard deviation is trivial, demeaning it naively is a no-go, as that would create a dense, 200+ terabyte matrix.
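For reference, the size estimate can be checked with a quick back-of-the-envelope calculation. This is only a sketch, assuming 8-byte Float64 entries and decimal terabytes; the variable names are mine.

```julia
# Rough size of the dense, demeaned matrix (Float64 entries assumed).
n_rows, n_cols = 3_000_000, 9_000_000
dense_bytes = n_rows * n_cols * sizeof(Float64)  # every entry becomes non-zero
dense_terabytes = dense_bytes / 1e12             # decimal terabytes
println(dense_terabytes)
```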

The relevant code for doing SVD in Julia can be found at https://github.com/JuliaLang/julia/blob/343b7f56fcc84b20cd1a9566fd548130bb883505/base/linalg/arnoldi.jl#L398

From my reading, a key element of this code is the AtA_or_AAt struct and several of the functions around it, specifically A_mul_B!. They are copied below for your convenience:

struct AtA_or_AAt{T,S} <: AbstractArray{T, 2}
    A::S
    buffer::Vector{T}
end

function AtA_or_AAt(A::AbstractMatrix{T}) where T
    Tnew = typeof(zero(T)/sqrt(one(T)))
    Anew = convert(AbstractMatrix{Tnew}, A)
    AtA_or_AAt{Tnew,typeof(Anew)}(Anew, Vector{Tnew}(max(size(A)...)))
end

function A_mul_B!(y::StridedVector{T}, A::AtA_or_AAt{T}, x::StridedVector{T}) where T
    if size(A.A, 1) >= size(A.A, 2)
        A_mul_B!(A.buffer, A.A, x)
        return Ac_mul_B!(y, A.A, A.buffer)
    else
        Ac_mul_B!(A.buffer, A.A, x)
        return A_mul_B!(y, A.A, A.buffer)
    end
end
size(A::AtA_or_AAt) = ntuple(i -> min(size(A.A)...), Val(2))
ishermitian(s::AtA_or_AAt) = true

This is passed into the eigs function, where some magic happens, and the output is then processed into the relevant components for SVD.

I think the best way to make this work for a 'centering on the fly' type of setup is to do something like subclass AtA_or_AAt with an AtA_or_AAt_centered version that more or less mimics its behavior but also stores the column means, and redefines the A_mul_B! function appropriately.
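To make the idea concrete, here is a rough sketch of what such a centered operator might look like. It is not the code from Julia base: the names CenteredGram and gram_mul! are hypothetical, it uses the current mul! API rather than A_mul_B!, it always applies Ac'Ac instead of choosing the smaller of the two products, and it assumes centering can be applied to the vector products instead of to the matrix itself.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical type: applies Ac'Ac, where Ac = A - 1*M', without forming Ac.
struct CenteredGram{T,S<:AbstractMatrix{T}}
    A::S               # original sparse matrix (n x m)
    means::Vector{T}   # column means M (length m)
    buffer::Vector{T}  # scratch space for Ac*x (length n)
end

function CenteredGram(A::AbstractMatrix{T}) where {T<:AbstractFloat}
    means = Vector{T}(vec(sum(A, dims=1)) ./ size(A, 1))
    CenteredGram{T,typeof(A)}(A, means, Vector{T}(undef, size(A, 1)))
end

# y = Ac' * (Ac * x), computed from sparse products plus cheap corrections.
function gram_mul!(y::AbstractVector, G::CenteredGram, x::AbstractVector)
    mul!(G.buffer, G.A, x)          # buffer = A*x
    G.buffer .-= dot(G.means, x)    # buffer = A*x - 1*(M'x)
    mul!(y, G.A', G.buffer)         # y = A'*buffer
    y .-= G.means .* sum(G.buffer)  # y = A'*buffer - M*(1'*buffer)
    return y
end

# Small usage example with made-up data.
A = sparse([1, 2, 3, 5], [1, 2, 2, 3], [1.0, 2.0, 3.0, 4.0], 5, 3)
G = CenteredGram(A)
x = [1.0, -2.0, 0.5]
y = gram_mul!(zeros(3), G, x)
```

An iterative eigensolver would still need size, eltype and similar methods defined for the type; those are left out here.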

However, I do not use Julia very much and have already run into some difficulty modifying things. Before I try to dive into this again, I was wondering if I could get feedback on whether this would be considered an appropriate plan of attack, or if there is simply a much easier way of doing SVD on such a large matrix (I haven't seen one, but I may have missed something).

edit: Instead of modifying base Julia, I've tried writing a "Centered Sparse Matrix" package that keeps the sparsity structure of the input sparse matrix, but enters the column means where appropriate in various computations. It's limited in what it has implemented, and it works. Unfortunately, it is still too slow, despite some pretty extensive efforts to try to optimize things.

After much fiddling with the sparse matrix algorithm, I realized that distributing the multiplication over the subtraction was dramatically more efficient:

Suppose our centered matrix Ac is formed from the original n x m matrix A and its vector of column means M (m x 1), together with an n x 1 vector of ones that I will just call 1. We are multiplying by an m x k matrix X:

Ac := (A - 1M')
AcX = (A - 1M')X
    = AX - 1M'X

And we are basically done. Stupidly simple, actually.

AX can be carried out with the usual sparse matrix multiplication function, M'X is a dense vector-matrix product (a 1 x k row vector), and the vector of 1's "broadcasts" (to use Julia's terminology) to each row of the AX intermediate result. Most languages have a way of doing that broadcasting without allocating extra memory for the full n x k matrix 1M'X.
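A minimal sketch of that product in current Julia, assuming M holds the column means of A. The function name centered_mul is hypothetical and this is not the code from my package, just the same idea in a few lines.

```julia
using SparseArrays

# Hypothetical helper: computes (A - 1*M') * X without forming the dense centered matrix.
function centered_mul(A::AbstractMatrix, M::AbstractVector, X::AbstractVecOrMat)
    AX = A * X        # sparse-dense product, n x k
    MtX = M' * X      # 1 x k row of per-column offsets (a scalar if X is a vector)
    return AX .- MtX  # broadcasting subtracts that row from every row of AX
end

# Small usage example with made-up data.
A = sparse([1.0 0.0; 0.0 2.0; 3.0 0.0])
M = vec(sum(A, dims=1)) ./ size(A, 1)  # column means
X = [1.0 2.0; 3.0 4.0]
AcX = centered_mul(A, M, X)
```

Since the columns of Ac sum to zero, the columns of the result should sum to (numerically) zero as well, which makes for a cheap sanity check.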

This is what I've implemented in my package for AcX and Ac'X. The resulting object can then be passed to algorithms, such as the svds function, which only depend on matrix multiplication and transpose multiplication.
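The transpose product follows from the same expansion: Ac'X = (A - 1M')'X = A'X - M(1'X), where 1'X is just the row of column sums of X. Below is a sketch under the same assumptions as before; centered_tmul is a hypothetical name, and the wiring into svds is not shown.

```julia
using SparseArrays

# Hypothetical helper: computes (A - 1*M')' * X = A'X - M*(1'X) for an n x k matrix X.
function centered_tmul(A::AbstractMatrix, M::AbstractVector, X::AbstractMatrix)
    AtX = A' * X                # sparse-dense product, m x k
    colsums = sum(X, dims=1)    # 1 x k, equals 1'X
    return AtX .- M .* colsums  # outer product M*(1'X) formed inside the fused broadcast
end

# Small usage example with made-up data.
A = sparse([1.0 0.0; 0.0 2.0; 3.0 0.0])
M = vec(sum(A, dims=1)) ./ size(A, 1)  # column means
X = [1.0 0.0; 2.0 1.0; 0.0 4.0]
ActX = centered_tmul(A, M, X)
```

Both products cost one sparse multiplication plus work proportional to the size of the dense result, so the sparsity of A is never lost.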
