Calculating eigen values of very large sparse matrices in python
Asked Answered
S

1

20

I have a very large sparse matrix which represents a transition martix in a Markov Chain, i.e. the sum of each row of the matrix equals one and I'm interested in finding the first eigenvalue and its corresponding vector which is smaller than one. I know that the eigenvalues are bounded in the section [-1, 1] and they are all real (non-complex).
I am trying to calculate the values using python's scipy.sparse.eigs function, however, one of the parameters of the functions is the number of eigenvalues/vectors to estimate and every time I've increased the number of parameters to estimate, the numbers of eigenvalues which are exactly one grew as well.
Needless to say, I am using the which parameter with the value 'LR' in order to get the k largest eigenvalues, with k being the number of values to estimate.
Does anyone have an idea how to solve this problem (finding the first eigenvalue smaller than one and its corresponding vector)?

Salvation answered 24/8, 2015 at 14:32 Comment(13)
You may have to study the documentation for the underlying ARPACK code.Homerhomere
@hpaulj, I've already done that, didn't help muchSalvation
Do you understand the problem, and matrix, well enough to know whether the are multiple eigs with this value? In other words, is this a real property of the matrix, or an error in the code?Homerhomere
@hpaulj, I understand the problem and matrix structure quite well. There are supposed to be multiple occurrences of 1 as an eigenvalue, this indicates the number of subset I can divide my graph/markov chain into. Part of my problem is that I don't know for every data set that number in advance.Salvation
You can probably first split your chain into strongly connected components (cf #21309348) and then compute the second largest eigenvalue for each component. IIUC, the answer you are then looking for is then the largest of these. (For general matrix, such a problem would be fairly difficult numerically, but the Markov chain structure of the matrix makes it much easier.)Torsi
@pv., just to make sure, I've understood your advice: You are suggesting I'll split my chain into multiple smaller ones and attempt to find the eigen vector for each individual sub-chain?Salvation
Any chance you could mathematically use SVD instead for what you are doing?Commentator
@Paul, no, the eigenvalues from SVD are the squared ones of the original matrix. Since the eigenvalues in my case are between -1 and 1, I can't use SVD,..Salvation
Just curious, how large is "very large" (the dimensions of your matrix)?Deaver
@JimRaynor about 7 million by 7 million or even larger... The problem I describe happens even with smaller matrices. I debugged it on a 8000 by 8000 oneSalvation
Perhaps you should report a bug to scipy.Postpositive
@qarma, I think it is not a bug of scipy. As far as I can tell scipy is using a 3rd party library (ARPACK) for the actual calculationsSalvation
Let's just say they are more fit to help you :)Postpositive
Q
1

I agree with @pv. If your matrix S was symmetric, you could see it as a laplacian matrix of the matrix I - S. The number of connected components of I - S is the number of zero-eigenvalues of this matrix (i.e, the dimension of the space associated to eigenvalue 1 of S). You could check the number of connected components of the graph whose similarity matrix is I - S*S' for a start, e.g. with scipy.sparse.csgraph.connected_components.

Quathlamba answered 19/1, 2016 at 10:11 Comment(1)
and what do i do if the matrix is not symmetric?Salvation

© 2022 - 2024 — McMap. All rights reserved.