condition number of sparse matrix

I'm trying to obtain the condition number of a scipy sparse matrix. The way I have managed to do it so far is by converting the matrix to dense and then obtaining its eigenvalues:

$ python
Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 26 2016, 10:47:25) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from numpy import array
>>> import numpy as np
>>> import scipy.sparse as sparse
>>> I = array([0,3,1,0])
>>> J = array([0,3,1,2])
>>> V = array([4,5,7,9])
>>> A = sparse.coo_matrix((V,(I,J)),shape=(4,4))
>>> A = A.todense()
>>> eig = np.linalg.eig(A)
>>> eig = eig[0].real, np.array(eig[1].real)
>>> def split(array, cond):
...     return (array[cond], array[~cond])
... 
>>> eigv, zero = split(eig[0], eig[0]>1e-10)
>>> cond = max(eigv) / min(eigv)
>>> cond
1.75

As you may expect, this becomes infeasible for large matrices. How is this properly done in Python?

Segarra answered 7/4, 2017 at 15:7 Comment(10)
Tried this? – Node
No, but isn't that working also on numpy dense arrays? – Segarra
Have you studied the sparse linear algebra documentation? – Hydrastine
I think it is. Probably still faster than your method, at least worth a try. – Node
This might be something. You can specify to calculate only the largest/smallest eigenvalues (according to magnitude). – Node
@Hydrastine I did search the documentation, and I couldn't find a single function that computes the condition number. It seems only scipy.sparse.linalg.lsmr returns it, but I'm not interested in solving the system. – Segarra
@Michael It is actually faster, but I'm still converting it to a dense array, so it is not scalable either. – Segarra
@Michael Your last comment may be useful! – Segarra
The question has been asked for sparse matrices in general: scicomp.stackexchange.com/q/15864. Apparently you can only estimate it, and even that can be expensive: researchgate.net/post/… – Hydrastine
Just to finalize this question, this link solved my problem. – Segarra

As Augusto Sisa mentions, the condition number is defined as

cond(A) := ||A|| * ||A^{-1}||

but this is essentially the ratio of the largest eigenvalue magnitude to the smallest eigenvalue magnitude. So it makes more sense to get just those two values using scipy.sparse.linalg.eigsh() (see the SciPy reference manual) and compute the ratio yourself.

import scipy.sparse
import scipy.sparse.linalg as lg

# A is the sparse matrix in question (eigsh assumes it is symmetric/Hermitian).
ew1, ev = lg.eigsh(A, which='LM')    # largest-magnitude eigenvalues
ew2, ev = lg.eigsh(A, sigma=1e-8)    # eigenvalues near zero  <--- takes a long time

ew1 = abs(ew1)
ew2 = abs(ew2)

condA = ew1.max() / ew2.min()

The sigma option is specified because the default settings do a poor job of finding the smallest eigenvalues; with sigma set, the eigenvalues near sigma are computed in shift-invert mode. This step takes a long time. You can speed it up at the expense of accuracy by setting the k= and ncv= options to values smaller than the defaults, e.g. k=3 and ncv=8, but then you don't really know whether you will get a good approximation of those eigenvalues. Using the k argument to compute more eigenvalues almost always improves the accuracy, and the defaults should be accurate enough for most purposes. Test it yourself to see the difference for your matrix; a sketch of the faster variant is shown below.
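
For illustration, a rough sketch of the faster but less accurate variant described above (the values k=3 and ncv=8 are just example settings, not recommendations):

# Request fewer eigenvalues and use fewer Lanczos vectors than the defaults.
ew1_fast = abs(lg.eigsh(A, k=3, ncv=8, which='LM', return_eigenvectors=False))
ew2_fast = abs(lg.eigsh(A, k=3, ncv=8, sigma=1e-8, return_eigenvectors=False))
condA_fast = ew1_fast.max() / ew2_fast.min()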

In general one uses sparse matrices because the dimensions of the matrix make storing all entries prohibitive. If storing the matrix itself is already prohibitive, computing its inverse is far worse: the computation is expensive in its own right, and a sparse matrix does not generally have a sparse inverse, so you are stuck with the storage problem again.
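
A quick way to see the last point for yourself (a small sketch, using a tridiagonal matrix as an example):

import scipy.sparse as sparse
import scipy.sparse.linalg as spla

# A sparse tridiagonal matrix: its inverse is, in general, completely dense.
n = 200
T = sparse.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format='csc')
T_inv = spla.inv(T)
print(T.nnz, T_inv.nnz)   # roughly 3*n nonzeros vs. n*n nonzeros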

Antibiotic answered 7/6, 2020 at 10:22 Comment(0)

As you mention, converting a sparse matrix to a dense matrix is not usually a good idea for large problems, so you shouldn't use functions like numpy.linalg.cond(), which is meant for dense problems.

cond(A) = ||A|| * ||A^{-1}||

Although it is perhaps not the most efficient way, you can use the functions norm and inv from scipy.sparse.linalg to evaluate the condition number (inverting a matrix is a computationally expensive process):

norm_A = scipy.sparse.linalg.norm(A)
norm_invA = scipy.sparse.linalg.norm(scipy.sparse.linalg.inv(A))
cond = norm_A*norm_invA

Example:

import numpy as np
import scipy.sparse as sparse
import scipy.sparse.linalg   # needed so that sparse.linalg is available

# Pentadiagonal test matrix with stencil [1, -4, 6, -4, 1].
n = 7
diagonals = np.array([[1, -4, 6, -4, 1]]).repeat(n, axis=0).transpose()
offsets = np.array([-2, -1, 0, 1, 2])
S = sparse.dia_matrix((diagonals, offsets), shape=(n, n))

norm_S = sparse.linalg.norm(S, np.inf)
norm_invS = sparse.linalg.norm(sparse.linalg.inv(S), np.inf)
cond = norm_S * norm_invS
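
For a small matrix like this you can sanity-check the result against the dense routine mentioned above (only as a verification on small problems, not something to do for large sparse matrices):

cond_dense = np.linalg.cond(S.toarray(), np.inf)
print(cond, cond_dense)   # the two values should agree up to floating-point error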
Salesin answered 1/5, 2018 at 0:57 Comment(0)

The condition number is a metric that, for example, allows one to bound the relative error in the solution of a linear system. A good introduction, in my opinion, is given in these lecture slides: https://www.cs.tau.ac.il/~dcor/Graphics/adv-slides/Solving.pdf
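
Concretely, for a linear system Ax = b whose right-hand side is perturbed to b + δb (with the solution perturbed by δx accordingly), the standard bound is

||δx|| / ||x|| <= cond(A) * ||δb|| / ||b||

so a large condition number means that small errors in the data can be amplified into large errors in the solution.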

Wikipedia alone is already a great resource: https://en.wikipedia.org/wiki/Condition_number

First, notice that using the eigenvalues to compute the condition number requires A to be SPD. For a general (even non-square) matrix it is the ratio of the largest to the smallest singular value.

import scipy.sparse as spa
import scipy.sparse.linalg   # needed so that spa.linalg is available

# A is the sparse matrix in question; each call computes a single singular value.
sv_max = spa.linalg.svds(A, return_singular_vectors=False, k=1, which='LM')[0]
sv_min = spa.linalg.svds(A, return_singular_vectors=False, k=1, which='SM')[0]
cond = sv_max / sv_min

svds computes only a partial set of the singular values. The first call calculates just one (k=1) singular value, the one with "Largest Magnitude"; the second call calculates only the singular value with "Smallest Magnitude".

Since only two singular values are computed, and we do not compute the corresponding singular vectors either, this code should be quite efficient.

Tagliatelle answered 27/5 at 10:51 Comment(0)
