SciPy SVD vs. NumPy SVD

Asked 14/9, 2015 at 16:7 Answered 28/6, 2023 at 20:4

Both SciPy and NumPy have built-in functions for singular value decomposition (SVD): scipy.linalg.svd and numpy.linalg.svd. What is the difference between the two? Is either of them better than the other?

Charmeuse answered 14/9, 2015 at 16:7 Comment(1)

I don't know about the main behavior, but the SciPy version has two additional options: 1) overwrite_a, which allows the input to be modified in place, reducing memory usage and possibly speeding things up, and 2) check_finite, which can be set to False so the call assumes the array contains only finite values, saving a small amount of overhead. – Feola 14/9, 2015 at 16:58
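A minimal sketch of those two options, assuming NumPy and SciPy are installed. The matrix size and variable names are arbitrary choices for illustration, and whether the input buffer is actually overwritten depends on its memory layout and dtype.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
a = rng.standard_normal((200, 100))
a_copy = a.copy()  # reference copy; `a` may be clobbered by the call below

# overwrite_a=True permits LAPACK to reuse the input buffer instead of copying it.
# check_finite=False skips the NaN/inf scan; only safe when the input is known to be finite.
u, s, vh = linalg.svd(a, full_matrices=False, overwrite_a=True, check_finite=False)

# Verify the factors against the copy, not against `a`, which may have been modified.
reconstruction_ok = np.allclose((u * s) @ vh, a_copy)
print(reconstruction_ok)
```

After a call with overwrite_a=True, the contents of `a` should be treated as undefined.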

According to the FAQ page, the scipy.linalg submodule provides a more complete wrapper for the Fortran LAPACK library, whereas numpy.linalg tries to be buildable independently of LAPACK.
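One visible consequence of the more complete wrapper is the lapack_driver parameter of scipy.linalg.svd, which numpy.linalg.svd does not expose. The following is only a sketch with an arbitrary random matrix; it checks that both drivers return the same singular values.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
a = rng.standard_normal((50, 30))

# "gesdd" is the divide-and-conquer driver (SciPy's default).
s_gesdd = linalg.svd(a, compute_uv=False, lapack_driver="gesdd")
# "gesvd" is the general rectangular driver.
s_gesvd = linalg.svd(a, compute_uv=False, lapack_driver="gesvd")

same_values = np.allclose(s_gesdd, s_gesvd)
print(same_values)
```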

I ran some benchmarks on the different implementations of the SVD functions and found that scipy.linalg.svd is faster than its NumPy counterpart.

However, the JAX-wrapped NumPy version, jax.numpy.linalg.svd, is even faster.

The full notebook for the benchmarks is available here.
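For anyone who wants to repeat such a comparison locally, here is a rough timing sketch. It is not the original notebook: the matrix size, the repeat counts, and the best_time helper are assumptions made for illustration, JAX is left out to keep the dependencies to NumPy and SciPy, and the numbers will vary with the BLAS/LAPACK build and library versions.

```python
import timeit

import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
a = rng.standard_normal((300, 300))  # arbitrary size chosen for illustration


def best_time(fn, repeat=3, number=3):
    """Return the best per-call wall time in seconds (hypothetical helper)."""
    return min(timeit.repeat(fn, repeat=repeat, number=number)) / number


timings = {
    "numpy.linalg.svd": best_time(lambda: np.linalg.svd(a)),
    "scipy.linalg.svd": best_time(lambda: linalg.svd(a)),
}

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.2f} ms")
```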

Sheeting answered 27/9, 2019 at 15:3 Comment(3)

Thanks! I was unaware of jax. – Unswear 20/2, 2020 at 21:1

These are somewhat moving targets. On both Windows and Linux, using either OpenBLAS or MKL, the performance of the NumPy and SciPy SVD routines is now identical. JAX may still be faster; I did not test it. – Screening 3/8, 2020 at 17:8

Would you know of any benchmarks on real, rather than random, data? Thanks – Walcoff 18/9, 2020 at 12:54

Apart from the error checking, the actual work seems to be done within LAPACK in both NumPy and SciPy.

Without having done any benchmarking, I guess the performance should be identical.
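A quick way to check that the two functions agree numerically is sketched below with an arbitrary random matrix. Only the singular values and the reconstructions are compared, because the singular vectors are determined only up to sign.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
a = rng.standard_normal((40, 25))

u_np, s_np, vh_np = np.linalg.svd(a, full_matrices=False)
u_sp, s_sp, vh_sp = linalg.svd(a, full_matrices=False)

# Singular values should match to floating-point tolerance.
values_match = np.allclose(s_np, s_sp)

# Both factorizations should reproduce the input matrix.
numpy_ok = np.allclose((u_np * s_np) @ vh_np, a)
scipy_ok = np.allclose((u_sp * s_sp) @ vh_sp, a)
print(values_match, numpy_ok, scipy_ok)
```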

Brogan answered 16/9, 2015 at 14:1 Comment(0)

Another distinction is that np.linalg.svd can compute SVDs over a whole stack of matrices in a single vectorized call, whereas sp.linalg.svd handles only one matrix at a time.

Example:

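import numpy as np
import scipy as sp
import scipy.linalg  # load the submodule explicitly; a bare "import scipy" may not

data = np.random.random((3, 3))               # a single matrix
data_array = np.random.random((10**6, 3, 3))  # one million matrices

# NumPy SVD (note: the third return value is V transposed, not V)
R, S, V = np.linalg.svd(data)        # works
R, S, V = np.linalg.svd(data_array)  # works: one SVD per 3x3 matrix in the stack

# SciPy SVD
R, S, V = sp.linalg.svd(data)        # works
R, S, V = sp.linalg.svd(data_array)  # fails !!! (raises an error for 3-D input)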

I have not benchmarked this, but while a direct one-to-one comparison might show sp.linalg.svd to be faster for a single matrix, np.linalg.svd might be faster (or at least more convenient) when you need to compute the SVD of every matrix in a large data array.
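If SciPy has to be used on a stack of matrices, one workaround is an explicit Python loop. The sketch below assumes a smaller stack than above so it runs quickly, and checks that the loop gives the same singular values as the single batched NumPy call.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
stack = rng.random((1000, 3, 3))  # 1000 matrices instead of one million, for a quick run

# NumPy: a single call handles the whole stack.
s_numpy = np.linalg.svd(stack, compute_uv=False)

# SciPy: loop in Python, passing one 2-D matrix per call.
s_scipy = np.array([linalg.svd(m, compute_uv=False) for m in stack])

results_match = np.allclose(s_numpy, s_scipy)
print(s_numpy.shape, results_match)
```

The loop pays Python call overhead once per matrix, which is the main reason the batched call is likely to be preferable for many small matrices.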

Virgy answered 28/6, 2023 at 20:4 Comment(0)
