How to merge a collection of ordered preferences
Asked Answered
B

4

21

I have a set of r reviewers who are rating a set of n objects. Each reviewer independently produces an ordered list of the objects he or she chooses to rank. The goal is to produce one list that is the collation of the various ordered lists. We can assume that each reviewer's point of view is equally weighted.

This differs from most merging and ordered list questions in that there is no global ordering. One reviewer can rate A > B while another can rate B > A. As mentioned, each object is not necessarily rated by each reviewer.

My current thought is to decompose each reviewer's list into a set of ordered tuples for each of the m * (m-1) * .5 unique pairs of entries in the list, where m is the number of objects rated. Now take all the tuples from all reviewers. For a given combination (a,b) find all such tuples and take the majority vote (of those voting) as the determinator of whether a < b.

Now I have a set of ordered tuples that represents the wisdom of all. But how do I turn these into one ordered list? I can start with a randomly chosen pair of objects, and order them, then add another one in the right order, but the output will depend on which one I choose to start with. Also there may be loops.

I'd appreciate any ideas.

Bagwell answered 15/6, 2011 at 2:5 Comment(1)
This relates somewhat to a question of my own: #22571138Whalen
W
15

A solution that seems elegant and still does what it needs to do would be to convert each ordering into a score from 1 to 0, where 1 is the first (top) ranked item in a given reviewer's list and 0 is their last (bottom item) and all items in between get a linearly scaled score. So if reviewer 1 ranks just 3 items, they would get scores for that list of 1, 0.5, and 0. Then you simply take the mean score of every item to generate a collated list. Ties could be broken by the number of "reviews" for an item (so that an item unanimously marked as best by 3 reviewers would appear higher in the final list than an item unanimously marked as best by 2 reviewers, etc.)

Your requirement "The goal is to produce one list that is the collation of the various ordered lists. We can assume that each reviewer's point of view is equally weighted." is definitely met by this simple algorithm, but often problems like this have a bit more complex requirements once you dig into them.

Woodpecker answered 15/6, 2011 at 2:11 Comment(4)
This is great. I was originally on the path of averaging ratings, but couldn't figure out how to normalize things. Your scheme elegantly and simply solves this. At least I think this achieves the desired goals. Now I'll try it on the data and see how it feels.Bagwell
Oops. just realized why this isn't ideal. Let's say there are 20 items and reviewer A ranks them all. But reviewer B only ranks 5. In your scheme, B's top choice is equivalent to A's top and ditto for bottom. But it's possible that the range of B's 5 items is really in the middle, or in the bottom quarter. This is why I had moved on to the partial ordering approach I tried to outline in my original post.Bagwell
I tried something similar, but the score is (rank - 1) / (items - 1), so the top ranked item has a score of 0 and the last item a score of 1. Additionally, non-ranked items also get a score of 1, as if they were all tied for last place. I think it evens things a bit, in that an item ranked high by only one reviewer that ranked may items is not pulled too high compared to an item that was ranked fairly high by several reviewers. Of course, it’s to rank video games, so the stakes aren’t high.Alie
Or even (rank - 1) / items so that the last item has a score slightly below 1, and the non-ranked items have a score of 1.Alie
Y
42

The problem space you're navigating is essentially (isomorphic to) a subset of voting theory, that part where both ballots and outcomes are ordered lists of candidates.

You might benefit from reading the following:

Based on my knowledge of voting theory, I'll give you a recommendation: if you reasonably believe that an O(n3) algorithm is workable for your particular data set, try out the Schulze Method. Otherwise, Borda Count is the only listed method which runs in O(n) time and takes a ranking as ballot inputs, so stick with that.

Yama answered 21/3, 2014 at 21:16 Comment(1)
Thanks. I learnt a lot of new things from the list of algorithms that you mentioned!Mont
W
15

A solution that seems elegant and still does what it needs to do would be to convert each ordering into a score from 1 to 0, where 1 is the first (top) ranked item in a given reviewer's list and 0 is their last (bottom item) and all items in between get a linearly scaled score. So if reviewer 1 ranks just 3 items, they would get scores for that list of 1, 0.5, and 0. Then you simply take the mean score of every item to generate a collated list. Ties could be broken by the number of "reviews" for an item (so that an item unanimously marked as best by 3 reviewers would appear higher in the final list than an item unanimously marked as best by 2 reviewers, etc.)

Your requirement "The goal is to produce one list that is the collation of the various ordered lists. We can assume that each reviewer's point of view is equally weighted." is definitely met by this simple algorithm, but often problems like this have a bit more complex requirements once you dig into them.

Woodpecker answered 15/6, 2011 at 2:11 Comment(4)
This is great. I was originally on the path of averaging ratings, but couldn't figure out how to normalize things. Your scheme elegantly and simply solves this. At least I think this achieves the desired goals. Now I'll try it on the data and see how it feels.Bagwell
Oops. just realized why this isn't ideal. Let's say there are 20 items and reviewer A ranks them all. But reviewer B only ranks 5. In your scheme, B's top choice is equivalent to A's top and ditto for bottom. But it's possible that the range of B's 5 items is really in the middle, or in the bottom quarter. This is why I had moved on to the partial ordering approach I tried to outline in my original post.Bagwell
I tried something similar, but the score is (rank - 1) / (items - 1), so the top ranked item has a score of 0 and the last item a score of 1. Additionally, non-ranked items also get a score of 1, as if they were all tied for last place. I think it evens things a bit, in that an item ranked high by only one reviewer that ranked may items is not pulled too high compared to an item that was ranked fairly high by several reviewers. Of course, it’s to rank video games, so the stakes aren’t high.Alie
Or even (rank - 1) / items so that the last item has a score slightly below 1, and the non-ranked items have a score of 1.Alie
C
5

Merging rankings is somewhat non-trivial. I think maybe you need to have a clearer idea of what you mean by "the collation of the ordered lists" - i.e. what properties do you want it to have?

See for example these CS course notes from Cornell. Given some reasonably sensible properties that the global ranking should have, it's actually impossible to create an algorithm that will definitely satisfy those properties. You may need to compromise in what you will accept as reasonable properties of your global ranking.

Carriole answered 15/6, 2011 at 2:17 Comment(1)
Very helpful. I knew there must be some academic papers on this kind of topic, but I couldn't stumble across the right search terms. "combining rankings" opens up a whole world of papers - thanks!Bagwell
A
0

Items: Duke forever Quake UT3 Duke3d Halo

REV1:
- Duke nukem forever
- Duke3d

REV2:
- Duke3d
- UT3
- Quake

REV3:
- Duke3d
- Duke nukem forever

Logical deduction:

  • REV1: Duke forever > duke3d #
  • REV1: Duke3d < Duke forever ##
  • REV2: Duke3d > UT3 %
  • REV2: UT3 < Duke3d %
  • REV2: Duke3d > Quake %%
  • REV2: Quake < Duke3d %%
  • REV2: UT3 > Quake %%%
  • REV2: Quake < UT3 %%%
  • REV3: Duke3d > Duke forever #
  • REV3: Duke forever < Duke3d ##

Most voted:

  • Duke3d 3
  • Duke forever 2
  • UT3, Quake 1

Remove # that contradict by creating contradicting pairs and convert to = Accept % that match in pairs. Group with count everything. If there are < and = contradictions the true is the one with more count.

Filtered then sorted by logic:

  • Duke3d, Duke forever
  • UT3
  • Quake

But sorting should be influenced by the voting count (Bayesian sort). That way Duke3d would win.

Halo cannot be placed anywhere because it is not voted...

Apron answered 15/6, 2011 at 2:43 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.