Using randomized_svd for recommendation

I was following the paper "Effective Latent Models for Binary Feedback in Recommender Systems" by Maksims N. Volkovs and Guang Wei Yu.

It produces recommendations with a model-based approach (SVD) that uses the neighborhood similarity information from collaborative filtering approaches.

So basically, instead of decomposing the user rating matrix R (M users * N songs) as we usually do in SVD for recommendation, the authors decompose the user-song prediction matrix S (M users * N songs), or the sparse matrix S (M users * top-k predicted songs).

Thus we get,

Ur, Sr, Vr = sklearn.utils.extmath.randomized_svd(S, n_components=1000)

where r = SVD rank = n_components.

We then generate predictions using Ur and Vr:

S(u, v) =  Ur(u, :) * Vr(v, :).T

where u = user, v = item, and T = transpose.
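As a point of comparison, here is a minimal sketch of that decomposition and scoring step on a toy random matrix (the sizes, the rank of 5, and the variable names are illustrative assumptions, not the paper's setup). One thing it makes visible: randomized_svd returns the singular values as a separate vector and returns V already transposed, so if Ur in the formula above is the raw first output, the product leaves the singular values out. Whether that accounts for the low mAP is not established here; it is only something worth checking.

```python
import numpy as np
from scipy import sparse
from sklearn.utils.extmath import randomized_svd

# Toy stand-in for the sparse score matrix S (M users x N songs); values are random.
S = sparse.random(50, 40, density=0.1, format="csr", random_state=0)

r = 5  # SVD rank (n_components); the question uses 1000
U, s, Vt = randomized_svd(S, n_components=r, random_state=0)

# Shapes: U is (M, r), s is (r,), Vt is (r, N), i.e. V is returned transposed.
Ur = U * s   # absorb the singular values into the user factors
Vr = Vt.T    # (N, r), one row per song

u, v = 3, 7
score_uv = Ur[u] @ Vr[v]   # S(u, v) ~= Ur(u, :) * Vr(v, :).T
S_hat = Ur @ Vr.T          # all scores at once (only feasible for small matrices)
```

For the full dataset, scoring one user at a time with Ur[u] @ Vr.T avoids materializing the dense M * N matrix.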

I generated the S (M * top-k) matrix using a collaborative filtering approach and fed it to randomized_svd.

But the predictions generated by this approach are not accurate. Using truncated mAP@500 as the performance measure, I get mAP = 0.01, while the authors report a good mAP of 0.14 on the same Kaggle Million Song Challenge data.

It is a lot to ask someone to read the paper and tell me what's wrong, but if someone has prior knowledge and can help, that would be great.
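Since a gap this large can also come from the metric itself, it may help to verify the evaluation code on a tiny hand-checkable case. The sketch below is one common way to compute truncated mAP@k; the function names are hypothetical, and normalizing by min(k, number of held-out items) is an assumption that should be checked against the challenge's official definition.

```python
def average_precision_at_k(ranked_items, relevant_items, k=500):
    """Truncated average precision for one user.

    Assumes ranked_items contains no duplicates.
    """
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_items[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    # Assumed normalization: the best achievable number of hits within k.
    return precision_sum / min(k, len(relevant))


def mean_average_precision_at_k(rankings, ground_truth, k=500):
    """Mean of the per-user truncated AP over all users in ground_truth."""
    users = list(ground_truth)
    if not users:
        return 0.0
    total = sum(
        average_precision_at_k(rankings.get(user, []), ground_truth[user], k)
        for user in users
    )
    return total / len(users)
```

Here rankings maps each user to a list of recommended song ids in rank order, and ground_truth maps each user to the held-out songs.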

Cullin answered 18/5, 2016 at 5:2 Comment(2)
I suggest asking this on the Data Science Stack Exchange; a lot of recommender questions come through there, and most people have a decent linear algebra background. - Bombay
This is more suitable for the Cross Validated Stack Exchange. - Suzette

It's a tough one without reviewing your whole project. Here are a couple of things you can do:

1) Check that you are preprocessing the dataset and splitting it into training and testing sets in the same way as described in the paper. Differences in data preprocessing and splitting can have a significant impact on the performance of the model.

2) The performance of the model can be sensitive to the choice of hyperparameters. You could try experimenting with different values of n_components and see if that improves the performance.
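The n_components experiment from point 2 could be sketched roughly as below. The helper sweep_ranks and the toy matrix are hypothetical, and the metric used here (relative reconstruction error) is only a placeholder; in practice you would pass a function that computes mAP on your held-out data instead.

```python
import numpy as np
from sklearn.utils.extmath import randomized_svd


def sweep_ranks(S, ranks, evaluate):
    """Fit a truncated SVD of S for each rank and return {rank: evaluate(Ur, Vr)}."""
    results = {}
    for r in ranks:
        U, s, Vt = randomized_svd(S, n_components=r, random_state=0)
        # Ur carries the singular values; Vr has one row per item.
        results[r] = evaluate(U * s, Vt.T)
    return results


# Toy dense matrix standing in for the score matrix S.
rng = np.random.default_rng(0)
S = rng.random((30, 20))


# Placeholder metric: relative reconstruction error (lower is better).
def reconstruction_error(Ur, Vr):
    return np.linalg.norm(S - Ur @ Vr.T) / np.linalg.norm(S)


errors = sweep_ranks(S, ranks=[2, 5, 10], evaluate=reconstruction_error)
```

Note that a lower reconstruction error does not necessarily mean a better ranking, which is why the sweep should be scored with the same mAP measure used for the final comparison.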

I always hated reading those papers :-) Try contacting the authors (LinkedIn or other sources). They do respond most of the time - at least for me.

Dodson answered 4/4, 2023 at 17:39 Comment(0)
