Consider I have a website where I've got a bunch of articles and people can vote on the articles they like.
I want to be able to query to get the articles with the most votes within a certain time (last hour, last day, last week) ordered by the number of votes.
As usual with MongoDB there are several different ways to implement this, but I am not sure which one is correct.
- A posts document, which contains an Array of Votes - Votes themselves are documents containing userid, username and vote date:
{
"_id": "ObjectId(xxxx)",
"title": "Post Title",
"postdate": "21/02/2012+1345",
"summary": "Summary of Article",
"Votes": [
{
"userid":ObjectId(xxxx),
"username": "Joe Smith",
"votedate": "03/03/2012+1436"
},
]
}
- A separate votes collection, containing details of an individual vote and a reference to the post that was voted on:
{
"_id": "ObjectId(xxxx)",
"postId": ObjectId(xxxx),
"userId": ObjectId(xxxx),
"votedate": "03/03/2012+1436"
}
The first one is more Documentey but I have no idea how to query the votes array to get the documents with the most votes in the last 24 hours.
I'm leaning towards the second one as it would be easier to query the vote count grouped by vote I think, but I'm not sure how well it would perform. This is how you'd do it in Relational Databases, but it doesn't seem very documenty - but I'm not sure if its a problem, is it?
Or do I use a combination of the two? Also would I do this type of aggregate query in real-time, every page load. Or do I just run the query say once per minute and store the results in a query result collection?
How would you implement this schema?