Matrix factorization for collaborative filtering - new users and items?

I've been reading about using matrix factorization for collaborative filtering, but I can't seem to find an example that deals with adding a new user or item to the system, or with a user rating a new item. In these cases, the item-user matrix and the factorization need to be recomputed, correct? How can this perform well with a large number of users and items? Is there a way around it?

Thank you

Chavey answered 7/10, 2012 at 8:52 Comment(2)
A couple of additional terms that might help you in your search are "online collaborative filtering" and "stochastic gradient descent". I have not used the following, and it is Java, but you may want to check out github.com/MrChrisJohnson/CollabStream as an example of a project that might address your need. – Escuage
See here for a possible solution: #41537970 – Multilingual

Your question has two parts: (A) how to deal with new users and items, and (B) how to deal with new interactions (e.g. ratings, clicks, etc.).

(A) There are basically two different strategies for dealing with new users and items (regardless of whether we use matrix factorization or something else):

  1. estimating user/item features from user attributes (demographics, surveys) or item attributes (price, genre, textual description, categories)
  2. active learning: showing new items to all users interacting with the system, or certain items to new users of the system, in a way that balances individual user experience against the information gained by the system.
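
As a rough illustration of strategy 1, here is a minimal sketch that estimates latent factors for a brand-new item from its genre. It is only one simple heuristic, not a method taken from any particular paper: the new item gets the average latent vector of the already-trained items that share a genre with it. The function name `estimate_item_factors`, the toy factors, and the genre sets are all made up for the example.

```python
def estimate_item_factors(new_item_genres, known_item_genres, item_factors):
    """Estimate factors for an unseen item by averaging the latent vectors
    of known items that share at least one genre with it (hypothetical helper)."""
    neighbours = [
        item_factors[item_id]
        for item_id, genres in known_item_genres.items()
        if genres & new_item_genres
    ]
    if not neighbours:
        # No genre overlap: fall back to the mean over all known items.
        neighbours = list(item_factors.values())
    n_factors = len(neighbours[0])
    return [
        sum(vec[f] for vec in neighbours) / len(neighbours)
        for f in range(n_factors)
    ]


# Toy data (assumed): factors as they might come out of an already-trained model.
item_factors = {
    "m1": [0.9, 0.1],
    "m2": [0.7, 0.3],
    "m3": [0.1, 0.8],
}
known_item_genres = {
    "m1": {"action"},
    "m2": {"action", "comedy"},
    "m3": {"drama"},
}

new_factors = estimate_item_factors({"action"}, known_item_genres, item_factors)
print(new_factors)  # average of m1 and m2
```

In practice the mapping from attributes to factors is usually learned (for example with a regression model) rather than averaged, but the idea is the same: the new item gets a usable vector before it has any ratings.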

There are many papers in the academic literature on both problems.

(B) This is really not problematic: incremental updates to a matrix factorization model do not have high computational costs. See, for example, this paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.165.8010&rep=rep1&type=pdf
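
To give an idea of why such an update is cheap, here is a minimal sketch of one common approach, often called "fold-in": the item factors are kept fixed and only the vector of the new (or updated) user is fitted with stochastic gradient descent on that user's ratings. This is not necessarily the exact procedure of the paper above or of any library; the function name `fold_in_user`, the hyperparameters, and the toy factors are assumptions for the example.

```python
import random


def fold_in_user(item_factors, ratings, n_factors,
                 lr=0.05, reg=0.05, epochs=500, seed=0):
    """Fit a latent vector for a single user with SGD.
    Item factors are read but never modified (hypothetical helper)."""
    rng = random.Random(seed)
    p = [rng.gauss(0.0, 0.1) for _ in range(n_factors)]
    for _ in range(epochs):
        for item_id, rating in ratings.items():
            q = item_factors[item_id]
            pred = sum(pf * qf for pf, qf in zip(p, q))
            err = rating - pred
            # Gradient step on the regularized squared error, user side only.
            for f in range(n_factors):
                p[f] += lr * (err * q[f] - reg * p[f])
    return p


def predict(p, q):
    """Predicted rating as the dot product of user and item factors."""
    return sum(pf * qf for pf, qf in zip(p, q))


# Toy item factors (assumed to come from an already-trained model).
item_factors = {
    "item_a": [1.0, 0.0],
    "item_b": [0.0, 1.0],
    "item_c": [1.0, 1.0],
}
new_user_ratings = {"item_a": 4.0, "item_b": 2.0}

p_new = fold_in_user(item_factors, new_user_ratings, n_factors=2)
print(predict(p_new, item_factors["item_c"]))  # score for an unrated item
```

The cost depends only on the number of ratings of that one user and the number of factors, not on the size of the whole user-item matrix. A new item can be handled the same way with the roles of users and items swapped.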

The MyMediaLite library (disclaimer: I am the main author) supports incremental updates for several matrix factorization methods: http://ismll.de/mymedialite

Nomadic answered 10/10, 2012 at 21:20 Comment(2)
If you use a factorization algorithm such as incremental SVD, and hence "complete" the user x item matrix, and a new customer arrives who either (1) has some ratings or (2) has no ratings, how would you "score" them without re-running the entire SVD? Under scenario 1, could you fall back to performing an SVD (not incremental, but standard SVD) on the "completed" matrix, then use a similarity measure to see which users they are closest to, and use the entries in the completed matrix to make recommendations? – Vandervelde
If you don't mind, one more question since you seem to be an expert in this field: Can the incremental SVD (Simon Funk) be used for binary data (customer purchased or not) or does something else need to be used? Thanks! – Vandervelde
