Java implementation of singular value decomposition for large sparse matrices

I'm wondering whether anyone knows of a Java implementation of singular value decomposition (SVD) for large sparse matrices. I need it for latent semantic analysis (LSA).

I tried the packages from UJMP and JAMA, but they choke when the number of rows is >= 1000 and the number of columns is >= 500. If anyone can point me to pseudocode or an existing implementation, that would be greatly appreciated.

Virg answered 25/7, 2011 at 17:28 Comment(5)
The answer to another, nearly identical question was to try Colt. – Luker
Well, the code for the SingularValueDecomposition class in Colt and JAMA is nearly identical. Moreover, it only works when m > n (more rows than columns). Also, I don't think the algorithms are optimized for sparse matrices. – Virg
The m > n condition doesn't really bother me. In fact, for me m > n holds 99.99% of the time (rows represent words and columns represent documents). It's just that this constraint isn't clearly documented. – Virg
With m = 2810 and n = 2809, it took 25 minutes using Colt. Not bad. – Virg
Look at la4j.decomposition. It might help. – Fula
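
On the m > n restriction raised in the comments above: a common workaround, stated here as general linear algebra rather than anything documented by Colt or JAMA, is to decompose the transpose and swap the roles of the two factor matrices. The symbols below follow the usual convention A = U Σ Vᵀ and are not taken from either library.

```latex
A = U \Sigma V^{T}
\quad\Longrightarrow\quad
A^{T} = \left(U \Sigma V^{T}\right)^{T} = V \Sigma^{T} U^{T}
```

So when a matrix has fewer rows than columns, one can run the decomposition on its transpose, then read the left singular vectors of the original matrix from the right factor of the result and vice versa. The singular values are the same in both cases.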

There's a list of Java numerical libraries on Wikipedia. The NIST library (JAMA), which is quite good, unfortunately does not handle sparse matrices. I'm not too familiar with the other packages. You might take a look at Colt; it's also quite high quality and does handle sparse matrices for some operations. I don't know whether that includes SVD, although I imagine it does. I've also heard that UJMP is worth a look.

EDIT: Sorry to hear that UJMP doesn't handle your problem. I had heard that it was worth a look.
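
Since the question also asks for pseudocode, here is a minimal sketch of one way to get the few largest singular triplets that LSA typically needs without ever forming a dense matrix. It is plain power iteration on AᵀA with deflation, not the algorithm used by Colt, JAMA, UJMP, or Commons Math. The class `SparseSvdSketch`, its coordinate-list storage, and all method names are hypothetical and exist only for this illustration. It assumes `iterations >= 1` and that the leading singular values are reasonably well separated.

```java
import java.util.Random;

// Hypothetical helper for illustration; not an API of any library named in this thread.
final class SparseSvdSketch {
    private final int m, n;              // matrix dimensions (rows, columns)
    private final int[] rowIdx, colIdx;  // coordinates of the stored nonzeros
    private final double[] values;       // the nonzero entries themselves

    SparseSvdSketch(int m, int n, int[] rowIdx, int[] colIdx, double[] values) {
        this.m = m;
        this.n = n;
        this.rowIdx = rowIdx;
        this.colIdx = colIdx;
        this.values = values;
    }

    // y = A x, touching only the stored nonzeros.
    double[] multiply(double[] x) {
        double[] y = new double[m];
        for (int e = 0; e < values.length; e++) {
            y[rowIdx[e]] += values[e] * x[colIdx[e]];
        }
        return y;
    }

    // x = A^T y, again touching only the stored nonzeros.
    double[] multiplyTranspose(double[] y) {
        double[] x = new double[n];
        for (int e = 0; e < values.length; e++) {
            x[colIdx[e]] += values[e] * y[rowIdx[e]];
        }
        return x;
    }

    private static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            s += a[i] * b[i];
        }
        return s;
    }

    // Returns the k largest singular values in descending order.
    // u[j] (length m) and v[j] (length n) receive the matching singular vectors.
    double[] topSingularValues(int k, int iterations, long seed, double[][] u, double[][] v) {
        Random rng = new Random(seed);
        double[] sigma = new double[k];
        for (int j = 0; j < k; j++) {
            double[] x = new double[n];
            for (int i = 0; i < n; i++) {
                x[i] = rng.nextGaussian();
            }
            for (int it = 0; it < iterations; it++) {
                double[] next = multiplyTranspose(multiply(x)); // next = A^T A x
                // Deflation: remove components along the vectors already found.
                for (int p = 0; p < j; p++) {
                    double c = dot(next, v[p]);
                    for (int i = 0; i < n; i++) {
                        next[i] -= c * v[p][i];
                    }
                }
                double len = Math.sqrt(dot(next, next));
                x = next;
                if (len == 0.0) {
                    break; // nothing left: the rank of A is below j + 1
                }
                for (int i = 0; i < n; i++) {
                    x[i] /= len;
                }
            }
            v[j] = x;
            double[] ax = multiply(x);
            sigma[j] = Math.sqrt(dot(ax, ax)); // sigma_j = ||A v_j||
            if (sigma[j] > 0.0) {
                for (int i = 0; i < m; i++) {
                    ax[i] /= sigma[j];
                }
            }
            u[j] = ax; // u_j = A v_j / sigma_j
        }
        return sigma;
    }
}
```

Each iteration costs time proportional to the number of nonzeros, so a word-by-document matrix stays cheap to work with as long as it is sparse. Power iteration converges slowly when neighbouring singular values are close together; Lanczos-type methods are the more usual choice for serious sparse SVD work, so treat this as a starting point rather than a replacement for a tested library.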

Sather answered 25/7, 2011 at 17:38 Comment(2)
Thanks, that list helped. If anyone is interested, the Apache Commons Math package does have an SVD implementation. It iterates only 30 times and then throws an exception. Digging a little deeper into the code, it is not apparent how to increase this limit (there are classes within classes within classes). – Virg
It uses the JAMA SVD under the hood. – Fula
