Sparse Vector vs Dense Vector

Asked 20/7, 2015 at 17:37 Answered 27/9, 2022 at 6:0

How do I create SparseVector and DenseVector representations?

If the DenseVector is:

denseV = np.array([0., 3., 0., 4.])

What will the SparseVector representation be?

Suffix answered 20/7, 2015 at 17:37 Comment(1)

For those who read the title of "Sparse Vector vs Dense Vector" and were looking for an explanation of when to use which, this answer has the information you're looking for. – Lazy 28/7, 2016 at 14:40

Unless I have thoroughly misunderstood your question, the MLlib data type documentation illustrates this quite clearly:

import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

// Create a dense vector (1.0, 0.0, 3.0).
Vector dv = Vectors.dense(1.0, 0.0, 3.0);
// Create a sparse vector (1.0, 0.0, 3.0) by specifying its indices and values corresponding to nonzero entries.
Vector sv = Vectors.sparse(3, new int[] {0, 2}, new double[] {1.0, 3.0});

Here the first argument of Vectors.sparse is the size of the vector, the second argument is an array of the indices of the nonzero entries, and the third argument is an array of the values at those indices.
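To make the meaning of the three arguments concrete, here is a minimal sketch in plain Python (the language of the question's snippet) that expands a (size, indices, values) triple back into the full list of values. The helper name sparse_to_dense is hypothetical and is not part of MLlib; it only mirrors how the arguments are interpreted.

```python
def sparse_to_dense(size, indices, values):
    """Expand a (size, indices, values) triple into a plain list of floats.

    Hypothetical helper for illustration only; not an MLlib function.
    """
    dense = [0.0] * size              # every position starts at zero
    for i, v in zip(indices, values):
        dense[i] = v                  # place each stored value at its index
    return dense


# Same arguments as the Vectors.sparse call in the Java example.
expanded = sparse_to_dense(3, [0, 2], [1.0, 3.0])
print(expanded)
```

Any position that is not listed in the indices array is treated as zero, which is why only two of the three entries had to be spelled out.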

Malena answered 20/7, 2015 at 18:34 Comment(4)

Oh, I was not passing the right count of indices. SparseV = SparseVector(4, [0, 1, 2, 3], [0., 3., 0., 4.]) – Suffix 21/7, 2015 at 17:0

What is the significance of the dot after a number, i.e. 1.? – Suffix 24/7, 2015 at 17:46

The dot just indicates a floating point type. 1. is equivalent to 1.0 – Countersignature 20/10, 2016 at 0:30

@MohitShah (i) It is literally the first code example on the linked documentation, and (ii) the answer also includes the example showing exactly how to create a sparse vector. – Malena 26/4, 2019 at 12:2

A sparse vector is one in which most of the values are zero, while a dense vector is one in which most of the values are nonzero.

To create a sparse vector from the dense vector you specified, use the following syntax:

import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

Vector sparseVector = Vectors.sparse(4, new int[] {1, 3}, new double[] {3.0, 4.0});
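If you would rather derive the indices and values than write them by hand, the conversion can be sketched in plain Python as below. The function dense_to_sparse is a hypothetical helper written for this answer, not an MLlib call, and it assumes the dense vector is given as an ordinary list of floats.

```python
def dense_to_sparse(dense):
    """Return (size, indices, values) for the nonzero entries of a list.

    Hypothetical helper for illustration only; not an MLlib function.
    """
    indices = [i for i, v in enumerate(dense) if v != 0.0]
    values = [dense[i] for i in indices]
    return len(dense), indices, values


# The dense vector from the question.
size, indices, values = dense_to_sparse([0.0, 3.0, 0.0, 4.0])
print(size, indices, values)
```

In PySpark, the resulting triple should map directly onto the constructor arguments, i.e. SparseVector(4, [1, 3], [3.0, 4.0]) from pyspark.mllib.linalg.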

Ronnieronny answered 13/4, 2017 at 10:17 Comment(0)

Dense: use it when most positions are likely to hold data. Sparse: use it when only a few positions are filled (i.e. the vector contains many zeros). For example, for {0.0, 3.0, 0.0, 4.0} the two representations are:

import org.apache.spark.mllib.linalg.Vectors

val posVector = Vectors.dense(0.0, 3.0, 0.0, 4.0)  // every value is stored, zeros included

val sparseVector = Vectors.sparse(4, Array(1, 3), Array(3.0, 4.0))  // only the nonzero entries are listed

Syntax: Vectors.sparse(size of vector, indices of nonzero entries, values at those indices)
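As a rough way to see when each form pays off, the plain-Python sketch below counts how many numbers each representation has to list. This is an assumption-laden simplification: it ignores the size field and the actual memory layout used by Spark, and stored_numbers is a hypothetical helper, not a library function.

```python
def stored_numbers(dense):
    """Count the numbers each representation needs to list.

    Dense lists every value; sparse lists one index and one value per
    nonzero entry. Hypothetical helper; ignores real memory layout.
    """
    nonzero = sum(1 for v in dense if v != 0.0)
    return {"dense": len(dense), "sparse": 2 * nonzero}


small = stored_numbers([0.0, 3.0, 0.0, 4.0])

# A longer vector with the same two nonzero values and 998 zeros.
mostly_zero = [0.0] * 1000
mostly_zero[1] = 3.0
mostly_zero[3] = 4.0
large = stored_numbers(mostly_zero)

print(small, large)
```

For the four-element vector both forms list four numbers, so nothing is gained; with 1000 positions and two nonzero entries the sparse form still lists four numbers while the dense form lists 1000.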

Jenniejennifer answered 27/9, 2022 at 6:0 Comment(0)
