Efficiently accumulating a collection of sparse scipy matrices

Asked 28/6, 2012 at 23:22 Answered 10/10, 2022 at 8:53

Solved python optimization numpy scipy sparse-matrix

I've got a collection of O(N) NxN scipy.sparse.csr_matrix, and each sparse matrix has on the order of N elements set. I want to add all these matrices together to get a regular NxN numpy array. (N is on the order of 1000). The arrangement of non-zero elements within the matrices is such that the resulting sum certainly isn't sparse (virtually no zero elements left in fact).

At the moment I'm just doing

reduce(lambda x,y: x+y,[m.toarray() for m in my_sparse_matrices])

which works but is a bit slow: of course the sheer amount of pointless processing of zeros which is going on there is absolutely horrific.

Is there a better way ? There's nothing obvious to me in the docs.

Update: as per user545424's suggestion, I tried the alternative scheme of summing the sparse matrices, and also summing sparse matrices onto a dense matrix. The code below shows all approaches to run in comparable time (Python 2.6.6 on amd64 Debian/Squeeze on a quad-core i7)

import numpy as np
import numpy.random
import scipy
import scipy.sparse
import time

N=768
S=768
D=3

def mkrandomsparse():
    m=np.zeros((S,S),dtype=np.float32)
    r=np.random.random_integers(0,S-1,D*S)
    c=np.random.random_integers(0,S-1,D*S)
    for e in zip(r,c):
        m[e[0],e[1]]=1.0
    return scipy.sparse.csr_matrix(m)

M=[mkrandomsparse() for i in xrange(N)]

def plus_dense():
    return reduce(lambda x,y: x+y,[m.toarray() for m in M])

def plus_sparse():
    return reduce(lambda x,y: x+y,M).toarray()

def sum_dense():
    return sum([m.toarray() for m in M])

def sum_sparse():
    return sum(M[1:],M[0]).toarray()

def sum_combo():  # Sum the sparse matrices 'onto' a dense matrix?
    return sum(M,np.zeros((S,S),dtype=np.float32))

def benchmark(fn):
    t0=time.time()
    fn()
    t1=time.time()
    print "{0:16}:  {1:.3f}s".format(fn.__name__,t1-t0)

for i in xrange(4):
    benchmark(plus_dense)
    benchmark(plus_sparse)
    benchmark(sum_dense)
    benchmark(sum_sparse)
    benchmark(sum_combo)
    print

and logs out

plus_dense      :  1.368s
plus_sparse     :  1.405s
sum_dense       :  1.368s
sum_sparse      :  1.406s
sum_combo       :  1.039s

although you can get one approach or the other to come out ahead by a factor of 2 or so by messing with N,S,D parameters... but nothing like the order of magnitude improvement you'd hope to see from considering the number of zero adds it should be possible to skip.

Nasal answered 28/6, 2012 at 23:22 Comment(0)

I think I've found a way to speed it up by a factor of ~10 if your matrices are very sparse.

In [1]: from scipy.sparse import csr_matrix

In [2]: def sum_sparse(m):
   ...:     x = np.zeros(m[0].shape)
   ...:     for a in m:
   ...:         ri = np.repeat(np.arange(a.shape[0]),np.diff(a.indptr))
   ...:         x[ri,a.indices] += a.data
   ...:     return x
   ...: 

In [6]: m = [np.zeros((100,100)) for i in range(1000)]

In [7]: for x in m:
   ...:     x.ravel()[np.random.randint(0,x.size,10)] = 1.0
   ...:     

        m = [csr_matrix(x) for x in m]

In [17]: (sum(m[1:],m[0]).todense() == sum_sparse(m)).all()
Out[17]: True

In [18]: %timeit sum(m[1:],m[0]).todense()
10 loops, best of 3: 145 ms per loop

In [19]: %timeit sum_sparse(m)
100 loops, best of 3: 18.5 ms per loop

Adalineadall answered 29/6, 2012 at 17:37 Comment(3)

Ah, excellent! This is the sort of efficient algorithm I'd expect; just a shame it doesn't seem to be provided already as an even more efficient "builtin". Will try it out soon... – Nasal 29/6, 2012 at 18:37

Yup, depends a bit on the density but x10 speed improvement is typical for the sort of numbers I'm interested in. – Nasal 29/6, 2012 at 20:27

Amazing. I've just been applying this same pattern in a few other places where I have sparse-dense interactions - typically dot product type things - and getting substantial speed-ups (x2-x3) every time. – Nasal 29/6, 2012 at 21:39

@user545424 has already posted what will likely be the fastest solution. Something in the same spirit that is more readable and ~same speed... nonzero() has all kinds of useful applications.

def sum_sparse(m):
        x = np.zeros(m[0].shape,m[0].dtype)
        for a in m:
            # old lines
            #ri = np.repeat(np.arange(a.shape[0]),np.diff(a.indptr))
            #x[ri,a.indices] += a.data
            # new line
            x[a.nonzero()] += a.data
        return x

Merrill answered 2/7, 2012 at 22:47 Comment(1)

it's elegant and beautiful! – Stocky 7/2, 2023 at 2:42

Can't you just add them together before converting to a dense matrix?

>>> sum(my_sparse_matrices[1:],my_sparse_matrices[0]).todense()

Adalineadall answered 29/6, 2012 at 0:22 Comment(1)

Tried this (see updated question) but it's not a massive (if at all) speed gain, probably because it becomes a complex thing to do as the intermediate results becomes more dense. I did have some hope summing sparse matrices onto a (initially zero) dense matrix would be more efficient, but it doesn't seem to be so. – Nasal 29/6, 2012 at 13:45

This is not a complete answer (and I too would like to see a more detailed response), but you can gain an easy factor of two or more improvement by not creating intermediate results:

def sum_dense():
    return sum([m.toarray() for m in M])

def sum_dense2():
    return sum((m.toarray() for m in M))

On my machine (YMMV), this results in being the fastest computation. By placing the summation in a () rather than a [], we construct a generator rather than the whole list before the summation.

Coachwork answered 29/6, 2012 at 14:31 Comment(2)

Thanks; hadn't come across "generator expressions" before python.org/dev/peps/pep-0289 . Only gets me a small improvement (~25%) in my test cases, but will certainly be using these more. – Nasal 29/6, 2012 at 19:42

@Nasal The improvement noted is on the comparison of sum_dense to sum_dense2, not against the other methods. If you are going to be making comparisons between algorithms, you should not penalize a particular choice because of bad implementation (in this case you are needlessly copying the array). – Coachwork 29/6, 2012 at 19:52

Convert to 2D-array and use built-in multiplication of sparse matrix. This is faster than @user545424 's method.

import numpy as np
from scipy.sparse import csr_matrix

m = [np.zeros((100,100)) for i in range(1000)]
for x in m:
   x.ravel()[np.random.randint(0,x.size,10)] = 1.0

m = [csr_matrix(x) for x in m]

def sum_sparse(m):
     x = np.zeros(m[0].shape)
     for a in m:
         ri = np.repeat(np.arange(a.shape[0]),np.diff(a.indptr))
         x[ri,a.indices] += a.data
     return x

def sum_sparse2(m):
    n_idx = []
    count = 0
    data = []
    indptr = [0]
    for a in m:
        b = a.indptr
        c = np.repeat(np.arange(b.shape[0]-1), b[1:] - b[:-1])
        n_idx.append(np.ravel_multi_index((c,a.indices), dims=a.shape))
        data.append(a.data)
        count += len(a.indices)
        indptr.append(count)
    
    data = np.concatenate(data)
    indptr = np.array(indptr)
    n_idx = np.concatenate(n_idx)
    mc = csr_matrix((data, n_idx, indptr), shape=(1000,100*100))
    
    res_sum = (np.ones(1000) @ mc).reshape((100,100))
    return res_sum

%timeit -r 10 sum_sparse2(m)
#6.46 ms ± 145 µs per loop (mean ± std. dev. of 10 runs, 100 loops each)

%timeit -r 10 sum_sparse(m)
#10.3 ms ± 114 µs per loop (mean ± std. dev. of 10 runs, 100 loops each)

Effeminate answered 10/10, 2022 at 8:53 Comment(0)

Recommended topics

Hot tags