Best way to model a voting system in MongoDB

I'm trying to model a voting system in MongoDB, similar to the one on Reddit. Requirements:

  1. Votes are connected to objects.
  2. It must be very fast to check whether a user has voted on an object. The application needs to know whether the logged-in user has voted on each object while it loops through a list of objects rendering vote buttons.
  3. Most importantly, it must be possible to retrieve objects ordered by their aggregate scores over a given time period (last hour, day, month, etc.) with reasonable performance.
  4. It should support thousands of votes per object.

I see two approaches here (correct me if I'm wrong!):

  1. Embed an array of vote documents in each object. I'd probably store the ObjectId of the user that voted, the vote amount, and the vote time. The voterId would be the key for each embedded vote document in the votes array to allow for a quick hash lookup.
  2. Keep a separate votes collection with votes that reference objects.
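
To make the two approaches concrete, here is a minimal Python sketch that models both document shapes as plain dicts, without a database connection. All field and function names (`votes`, `amount`, `at`, `object_id`, `voter_id`, and the helpers) are assumptions made for illustration, not part of any MongoDB API.

```python
from datetime import datetime, timezone

# Approach 1: votes embedded in the object, keyed by voter id.
def embedded_object(object_id, title):
    return {"_id": object_id, "title": title, "votes": {}}

def cast_embedded_vote(obj, voter_id, amount, when):
    # One entry per voter; voting again overwrites the earlier vote.
    obj["votes"][voter_id] = {"amount": amount, "at": when}

def has_voted_embedded(obj, voter_id):
    # Key lookup on a document that is already loaded for rendering.
    return voter_id in obj["votes"]

# Approach 2: one document per vote, stored in a separate collection.
def vote_document(object_id, voter_id, amount, when):
    return {"object_id": object_id, "voter_id": voter_id,
            "amount": amount, "at": when}

def has_voted_separate(votes, object_id, voter_id):
    # Stand-in for a query on (object_id, voter_id); a real collection
    # would want a compound index on those two fields.
    return any(v["object_id"] == object_id and v["voter_id"] == voter_id
               for v in votes)

# Example data (hypothetical ids).
when = datetime(2011, 8, 12, 21, 22, tzinfo=timezone.utc)
post = embedded_object("post1", "Hello")
cast_embedded_vote(post, "userA", 1, when)
votes_collection = [vote_document("post1", "userA", 1, when)]
```

With the embedded shape, the "has this user voted" check needs no extra query; with the separate collection, it becomes a second lookup per page of objects.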

I've also played with the idea of embedding votes into 'buckets' grouped by hour in a separate collection.
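
A rough sketch of the hourly-bucket idea, again in plain Python with an in-memory dict standing in for the separate collection. The names `hour_bucket`, `record_vote`, and `scores_since` are hypothetical; the point is only that a windowed score becomes a sum over a handful of bucket totals rather than over individual votes.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

def hour_bucket(when):
    # Truncate a timestamp to the start of its hour.
    return when.replace(minute=0, second=0, microsecond=0)

def record_vote(buckets, object_id, amount, when):
    # One running total per (object, hour) pair.
    buckets[(object_id, hour_bucket(when))] += amount

def scores_since(buckets, since):
    # Sum every bucket at or after the hour containing `since`,
    # so the window is only accurate to the hour.
    cutoff = hour_bucket(since)
    totals = defaultdict(int)
    for (object_id, hour), score in buckets.items():
        if hour >= cutoff:
            totals[object_id] += score
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Example data (hypothetical ids and times).
now = datetime(2011, 8, 12, 21, 22, tzinfo=timezone.utc)
buckets = defaultdict(int)
record_vote(buckets, "a", 1, now - timedelta(minutes=17))
record_vote(buckets, "b", 5, now - timedelta(hours=11))
last_day = scores_since(buckets, now - timedelta(days=1))
```

The trade-off is that buckets lose the per-voter detail, so the "has this user voted" check would still need the voter ids stored somewhere else.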

Approach 1 would be very fast for requirement 2, but I don't know whether requirement 3 is even possible in this scenario.

Approach 2 would be slower for requirement 2, and I'm not sure what the performance would be like for requirement 3 or how it would be achieved (map-reduce?).
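
For the separate-collection approach, one possible answer to "how" is sketched below. It only builds the query documents as Python dicts. It assumes a MongoDB version that has the aggregation pipeline (a feature newer than this question) and assumes vote documents with the hypothetical fields `object_id`, `voter_id`, `amount`, and `at`.

```python
from datetime import datetime, timedelta, timezone

def voted_filter(voter_id, object_ids):
    # Requirement 2: one query for the whole page of objects being rendered,
    # instead of one query per vote button.
    return {"voter_id": voter_id, "object_id": {"$in": list(object_ids)}}

def top_objects_pipeline(since, limit=25):
    # Requirement 3: sum vote amounts per object within the time window,
    # then order by that score, highest first.
    return [
        {"$match": {"at": {"$gte": since}}},
        {"$group": {"_id": "$object_id", "score": {"$sum": "$amount"}}},
        {"$sort": {"score": -1}},
        {"$limit": limit},
    ]

# Example: top objects of the last hour (hypothetical reference time).
now = datetime(2011, 8, 12, 21, 22, tzinfo=timezone.utc)
pipeline = top_objects_pipeline(now - timedelta(hours=1))
```

With PyMongo, these would be passed to `collection.find(...)` and `collection.aggregate(...)` respectively. Whether this is fast enough depends on vote volume and on having an index on the timestamp field, so it is a starting point rather than a measured answer.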

Basically, it seems I need to start with a reasonably fast solution for requirement 3 and then make sure that requirement 2 is not too slow. Ideas?


Potential Solution

Use the embedded approach. Add a field to each object for hourly-score, daily-score, monthly-score, etc. Also add boolean flags recently-voted, recent-hourly, and recent-daily. Create a script that runs a map-reduce on objects to calculate and update these fields.

The script would be run via cron in three variations.

  1. 10-minute interval: calculate hourly-score for objects with a previous hourly-score > 0 OR recently-voted = true. Afterwards, set recently-voted = false and recent-hourly = true.
  2. 3-hour interval: calculate daily-score for any objects that have recent-hourly = true. Set recent-hourly = false and recent-daily = true.
  3. 24-hour interval: calculate monthly-score for any objects that have recent-daily = true. Set recent-daily = false.
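
The three variations above can be sketched in plain Python over in-memory objects. This is an illustration of the flag logic only, not a map-reduce job; the underscore field names, the helper names, and the 30-day month are assumptions.

```python
from datetime import datetime, timedelta, timezone

def new_object(object_id):
    return {"_id": object_id, "votes": {},
            "hourly_score": 0, "daily_score": 0, "monthly_score": 0,
            "recently_voted": False, "recent_hourly": False, "recent_daily": False}

def score_since(obj, since):
    # Sum the embedded vote amounts cast at or after `since`.
    return sum(v["amount"] for v in obj["votes"].values() if v["at"] >= since)

def run_hourly(objects, now):
    # 10-minute job: freshly voted objects, plus objects whose old
    # hourly score may need to decay back toward 0.
    for obj in objects:
        if obj["hourly_score"] > 0 or obj["recently_voted"]:
            obj["hourly_score"] = score_since(obj, now - timedelta(hours=1))
            obj["recently_voted"] = False
            obj["recent_hourly"] = True

def run_daily(objects, now):
    # 3-hour job: only objects the hourly job has touched since the last run.
    for obj in objects:
        if obj["recent_hourly"]:
            obj["daily_score"] = score_since(obj, now - timedelta(days=1))
            obj["recent_hourly"] = False
            obj["recent_daily"] = True

def run_monthly(objects, now):
    # 24-hour job: only objects the daily job has touched since the last run.
    for obj in objects:
        if obj["recent_daily"]:
            obj["monthly_score"] = score_since(obj, now - timedelta(days=30))
            obj["recent_daily"] = False
```

One thing the sketch makes visible: as described, only the hourly job re-checks objects with a previous score > 0. The daily and monthly jobs have no equivalent condition, so their scores would stay stale once votes age out of the window unless a similar check is added.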

The idea is to minimize unnecessary processing on objects that aren't relevant to the score calculation being run: the hourly script should only touch objects that have been voted on since its last run, or objects that have not been voted on but whose score needs to be reset to 0. Another nice benefit is that the *-score values don't have to be based on votes alone; you could include page views, for example. Thoughts on this approach?

Rayraya asked 12/8, 2011 at 21:22
Comments (2)
Great question. I've used MongoDB and could give you a (probably poor) solution, but I'm looking forward to seeing what ideas people come up with. - Franko
I'm sure you have a good reason for doing this in Mongo, but given the requirements, I would expect a Kimball-modelled star schema in a traditional relational database, with traditional data warehouse indexing strategies, to outperform it. The star schema would be the main data model (no transfer from a normalized model into a data warehouse), and it would need no regular cron jobs. It would handle #3 very well, since fast ad-hoc reporting is what star schemas are good at. - Buffoon

Check out the "Voting with Atomic Operators" recipe in the Mongo Cookbook: http://cookbook.mongodb.org/patterns/votes/. It doesn't tell you how to implement aggregation, but you could perhaps do that by creating stand-in documents that each represent a votable object for a specific time period.
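
For readers who cannot reach the recipe, the general pattern can be sketched as follows. This is a paraphrase of the idea rather than the recipe's exact text, and the field names `votes` and `voters` are assumptions: a single update whose filter excludes documents the user already voted on, so the counter increment and the voter list stay consistent in one atomic operation.

```python
def upvote_filter(object_id, user_id):
    # Match the object only if user_id is not already in its voters array.
    return {"_id": object_id, "voters": {"$ne": user_id}}

def upvote_update(user_id):
    # Increment the counter and record the voter in the same update.
    return {"$inc": {"votes": 1}, "$push": {"voters": user_id}}

def apply_upvote_in_memory(doc, user_id):
    # Pure-Python stand-in for what the server does with the two
    # documents above; returns False when the filter would not match.
    if user_id in doc["voters"]:
        return False
    doc["votes"] += 1
    doc["voters"].append(user_id)
    return True

# Example (hypothetical ids): the second vote by the same user is a no-op.
story = {"_id": "post1", "votes": 0, "voters": []}
apply_upvote_in_memory(story, "userA")
apply_upvote_in_memory(story, "userA")
```

With PyMongo, the two dicts would be passed as `collection.update_one(upvote_filter(...), upvote_update(...))`, and a `modified_count` of 0 on the result would indicate the user had already voted.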

Behnken answered 15/8, 2011 at 18:23
Comments (3)
Hmm, yeah, I've read that article and a few others. They all suggest solution No. 1 (embedded), but they don't solve the trickier aggregation issue. I cannot think of a reasonably fast way to do it ad hoc with Mongo. Background processing via cron / map-reduce is all I've come up with so far. - Rayraya
I'm in the same situation, and I found your answer helpful because there are many related patterns there. Thanks, mate. - Polyadelphous
The cookbook seems to be gone now, but the page is still in the Wayback Machine: web.archive.org/web/20130620101300/http://cookbook.mongodb.org/… (and the cookbook source is still on GitHub: github.com/mongodb/cookbook) - Twice

If you are using Ruby, there is a votable_mongo plugin for Mongoid & MongoMapper.

Homochromous answered 13/8, 2011 at 6:48
Comments (2)
I'm using PHP, but the question is the more general one of how to organize this type of system in MongoDB. I'll look into how votable_mongo does it to see if there's a potential answer there, though. - Rayraya
OK, I took a look. It's a version of solution No. 1 above. It doesn't seem to offer a viable solution to requirement No. 3 (the most important/difficult requirement). - Rayraya
