I am using Spark with Scala to calculate the cosine similarity between the rows of a DataFrame.
The DataFrame schema is below:
|-- SKU: double (nullable = true)
|-- Features: vector (nullable = true)
A sample of the DataFrame is below:
+------+--------------------+
|   SKU|            Features|
+------+--------------------+
|9970.0|[4.7143,0.0,5.785...|
|3296.0|[4.7143,1.4286,6....|
|   1.0|[4.2308,0.7692,5....|
| 513.0|[3.0,0.0,4.9091,5...|
|3753.0|[5.9231,0.0,4.846...|
|2803.0|[4.2308,0.0,4.846...|
+------+--------------------+
I tried transposing the matrix and looked at the following questions: "Apache Spark Python Cosine Similarity over DataFrames" and "calculating-cosine-similarity-by-featurizing-the-text-into-vector-using-tf-idf". However, I believe there is a better solution.
I tried the sample code below:
val irm = new IndexedRowMatrix(inClusters.rdd.map {
  case (v,i:Vector) => IndexedRow(v, i)
})
But I got the following error:
Error:(80, 12) constructor cannot be instantiated to expected type;
found : (T1, T2)
required: org.apache.spark.sql.Row
case (v,i:Vector) => IndexedRow(v, i)
I also checked the question "Apache Spark: How to create a matrix from a DataFrame?", but I can't get that approach to work in Scala.
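For reference, the error message suggests that inClusters.rdd is an RDD[Row], so a tuple pattern such as (v, i) cannot match its elements. Below is a minimal sketch of what matching on Row(...) might look like, followed by one possible way to get row-to-row cosine similarities by transposing and calling columnSimilarities(). It assumes Spark 2.x or later, where the "vector" column holds org.apache.spark.ml.linalg.Vector (on Spark 1.x the column would already be an mllib Vector and the fromML conversion would not apply). The object name RowCosineSimilarity and its method names are hypothetical, and using SKU as the row index is also an assumption.

```scala
import org.apache.spark.ml.linalg.{Vector => MLVector}
import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
import org.apache.spark.mllib.linalg.distributed.{CoordinateMatrix, IndexedRow, IndexedRowMatrix}
import org.apache.spark.sql.{DataFrame, Row}

// Hypothetical helper object, for illustration only.
object RowCosineSimilarity {

  // Builds an IndexedRowMatrix from the SKU and Features columns.
  // Assumption: SKU values are whole, non-negative numbers, so they can be
  // used as Long row indices. Rows with a null SKU or null Features are not
  // handled here and would fail with a MatchError.
  def toIndexedRowMatrix(df: DataFrame): IndexedRowMatrix = {
    val rows = df.select("SKU", "Features").rdd.map {
      // df.rdd yields Row objects, so match on Row(...) instead of a tuple.
      case Row(sku: Double, features: MLVector) =>
        // IndexedRow expects a Long index and an mllib (not ml) Vector.
        IndexedRow(sku.toLong, OldVectors.fromML(features))
    }
    new IndexedRowMatrix(rows)
  }

  // columnSimilarities() compares columns, so transpose first to make each
  // original row (SKU) a column. Each resulting entry (i, j, value) holds the
  // cosine similarity between SKU i and SKU j, reported for i < j only.
  def rowSimilarities(df: DataFrame): CoordinateMatrix =
    toIndexedRowMatrix(df)
      .toCoordinateMatrix()
      .transpose()
      .toIndexedRowMatrix()
      .columnSimilarities()
}
```

With the DataFrame from the question, RowCosineSimilarity.rowSimilarities(inClusters).entries would give an RDD of MatrixEntry values. Note that the transposed matrix has as many columns as the largest SKU plus one, so this sketch is probably only reasonable when SKU values are not very large.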