How to acquire or generate test data for a recommender system

2

10

I'm currently researching recommender systems and would like to know how other researchers acquire or generate test data to evaluate the systems' performance.

Bombay answered 9/3, 2012 at 20:37 Comment(0)
8

When I was working with recommender systems, I had the exact same problem. I enjoyed the GroupLens dataset the most:

http://grouplens.org/node/12

You can download ratings given by users to movies.
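As a rough sketch of what working with the downloaded ratings can look like, the snippet below parses lines in the tab-separated `user id, item id, rating, timestamp` layout used by the MovieLens 100K `u.data` file. The sample rows and the `parse_ratings` helper are made up for illustration. Other MovieLens releases use different separators, so check the README shipped with the archive you download.

```python
import csv
import io

# Made-up sample rows in the tab-separated layout of MovieLens 100K `u.data`:
# user id, item id, rating, timestamp.
SAMPLE = "1\t10\t5\t881250949\n1\t20\t3\t891717742\n2\t10\t4\t878887116\n"


def parse_ratings(handle):
    """Yield (user_id, item_id, rating) tuples from a tab-separated file object.

    `parse_ratings` is a hypothetical helper, not part of any library.
    The timestamp column is ignored here.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        yield int(row[0]), int(row[1]), float(row[2])


# With a real download you would pass an open file instead of the sample,
# e.g. open("u.data", newline=""), assuming that is where you unpacked it.
ratings = list(parse_ratings(io.StringIO(SAMPLE)))
print(ratings)
```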

Also, I described in my blog some datasets I found while researching:

http://girlincomputerscience.blogspot.com.br/2010/12/datasets.html

Hope it helps!

Symphysis answered 2/11, 2012 at 19:48 Comment(0)
7

I don't know what field you're evaluating, but if it's movie recommendations, you could use the MovieLens data from GroupLens to start out with. (It seems like their site is temporarily down, but I'm sure it will be back up soon).

They have three data sets (100,000 ratings, 1 million, and 10 million), and they seem to be more or less the standard that everyone starts out with.
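To turn a ratings set like this into test data, one common approach is to hold out part of each user's ratings and evaluate predictions against them. Here is a minimal sketch of that idea, assuming the ratings are already loaded as `(user_id, item_id, rating)` tuples. The `split_per_user` helper, the 20% fraction, and the toy ratings are my own assumptions for illustration, not something GroupLens prescribes.

```python
import random
from collections import defaultdict


def split_per_user(ratings, test_fraction=0.2, seed=0):
    """Split (user_id, item_id, rating) tuples into train and test lists.

    Hypothetical helper: holds out a fraction of each user's ratings so
    that every user in the test set also appears in the training set.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    by_user = defaultdict(list)
    for user_id, item_id, rating in ratings:
        by_user[user_id].append((user_id, item_id, rating))

    train, test = [], []
    for user_ratings in by_user.values():
        rng.shuffle(user_ratings)
        # Hold out at least one rating, but always keep one for training.
        # A user with a single rating therefore stays entirely in train.
        n_test = max(1, int(len(user_ratings) * test_fraction))
        n_test = min(n_test, len(user_ratings) - 1)
        test.extend(user_ratings[:n_test])
        train.extend(user_ratings[n_test:])
    return train, test


# Toy data: 3 users, 5 ratings each (values are made up).
toy_ratings = [(u, i, float((u + i) % 5 + 1)) for u in range(1, 4) for i in range(1, 6)]
train, test = split_per_user(toy_ratings)
print(len(train), len(test))
```

A random per-user split is only one option. If the data set has timestamps, holding out each user's most recent ratings may be closer to how a deployed recommender is actually used.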

Kerrikerrie answered 12/3, 2012 at 13:46 Comment(2)
Awesome! Thanks for the info. What if people were looking for a data set that was item-based rather than rating-based? For example, collaborative filtering vs. content filtering/item filtering/information retrieval. - Bombay
What do you mean? The GroupLens set can be used for collaborative filtering, too. - Kerrikerrie
