How should I order these "helpful" scores?

Under the user-generated posts on my site, I have an Amazon-like rating system:

   Was this review helpful to you: Yes | No

If there are votes, I display the results above that line like so:

   5 of 8 people found this reply helpful.

I would like to sort the posts based upon these rankings. If you were ranking from most helpful to least helpful, how would you order the following posts?

   a) 1/1 = 100% helpful
   b) 2/2 = 100% helpful
   c) 999/1000 = 99.9% helpful
   d) 3/4 = 75% helpful
   e) 299/400 = 74.8% helpful

Clearly, it's not right to sort just on the percent helpful; somehow the total votes should be factored in. Is there a standard way of doing this?

UPDATE:

Using Charles' formula to calculate the Agresti-Coull lower bound and sorting on it, this is how the above examples would sort:

   1) 999/1000 (99.9%) = 95% likely to fall in 'helpfulness' range of 99.2% to 100%
   2) 299/400 (74.8%) = 95% likely to fall in 'helpfulness' range of 69.6% to 79.3%
   3) 3/4 (75%) = 95% likely to fall in 'helpfulness' range of 24.7% to 97.5%
   4) 2/2 (100%) = 95% likely to fall in 'helpfulness' range of 23.7% to 100%
   5) 1/1 (100%) = 95% likely to fall in 'helpfulness' range of 13.3% to 100%

Intuitively, this feels right.

UPDATE 2:

From an application point of view, I don't want to run these calculations every time I pull up a list of posts. I'm thinking I'll update and store the Agresti-Coull lower bound either on a regular, cron-driven schedule (updating only those posts which have received a vote since the last run) or whenever a new vote is received.
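
A rough sketch of the update-on-vote approach, using the AgrestiCoullLower() function from Charles's answer below; save_lower_bound() and the argument names are hypothetical stand-ins for whatever the application actually stores:

float AgrestiCoullLower(int n, int k);            // from Charles's answer below
void  save_lower_bound(int post_id, float lower); // hypothetical database write

// Recompute and store the lower bound whenever a post receives a vote, so the
// post list can simply be ordered by the stored value.
void on_vote_received(int post_id, int total_votes, int helpful_votes) {
    float lower = AgrestiCoullLower(total_votes, helpful_votes);
    save_lower_bound(post_id, lower);
}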

Revisory answered 20/9, 2010 at 6:34 Comment(0)

For each post, generate bounds on how helpful you expect it to be. I prefer to use the Agresti-Coull interval. Pseudocode:

#include <math.h>

// Lower bound of the Agresti-Coull confidence interval for k "helpful"
// votes out of n total votes.
float AgrestiCoullLower(int n, int k) {
  //float conf = 0.05;  // 95% confidence interval
  float kappa = 2.24140273; // In general, kappa = ierfc(conf/2)*sqrt(2)
  float kest = k + kappa*kappa/2;   // adjusted success count
  float nest = n + kappa*kappa;     // adjusted sample size
  float pest = kest/nest;           // adjusted proportion
  float radius = kappa*sqrtf(pest*(1-pest)/nest);
  return fmaxf(0, pest-radius); // Lower bound
  // Upper bound is fminf(1, pest+radius)
}

Then take the lower end of the estimate and sort on it. The 2/2 post is (by Agresti-Coull) 95% likely to fall in the 'helpfulness' range 23.7% to 100%, so it sorts below the 999/1000 post, which has range 99.2% to 100% (since .237 < .992).
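
As a minimal sketch of what that sort might look like in C (the Post struct and field names are assumptions for illustration, not part of the answer):

#include <stdlib.h>

// Hypothetical post record holding the vote tallies.
typedef struct {
    int votes;    // total votes (n)
    int helpful;  // "helpful" votes (k)
} Post;

// qsort comparator: descending by Agresti-Coull lower bound.
int compare_posts(const void *a, const void *b) {
    const Post *pa = a, *pb = b;
    float la = AgrestiCoullLower(pa->votes, pa->helpful);
    float lb = AgrestiCoullLower(pb->votes, pb->helpful);
    if (la < lb) return 1;    // b is more reliably helpful
    if (la > lb) return -1;   // a is more reliably helpful
    return 0;
}

// Usage: qsort(posts, num_posts, sizeof(Post), compare_posts);

In practice you would probably precompute and store the lower bound (as the asker describes in UPDATE 2) rather than recomputing it inside the comparator.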

Edit: Since some people seem to have found this helpful (ha ha), let me note that the algorithm can be tweaked based on how confident/risk-averse you want to be. The less confidence you need, the more willing you will be to abandon the 'proven' (high-vote) reviews for the untested but high-scoring reviews. A 90% confidence interval gives kappa = 1.95996398, an 85% confidence interval gives 1.78046434, a 75% confidence interval gives 1.53412054, and the all-caution-to-the-wind 50% confidence interval gives 1.15034938.

The 50% confidence interval gives

1) 999/1000 (99.7%) = 50% likely to fall in 'helpfulness' range of 99.7% to 100%
2) 299/400 (72.2%) = 50% likely to fall in 'helpfulness' range of 72.2% to 77.2%
3) 2/2 (54.9%) = 50% likely to fall in 'helpfulness' range of 54.9% to 100%
4) 3/4 (45.7%) = 50% likely to fall in 'helpfulness' range of 45.7% to 91.9%
5) 1/1 (37.5%) = 50% likely to fall in 'helpfulness' range of 37.5% to 100%

which isn't that different overall, but it does prefer the 2/2 to the safety of the 3/4.

Photoluminescence answered 20/9, 2010 at 15:28 Comment(5)
For ties (especially those at 0), I suggest breaking in favor of the largest number of upvotes, then the smallest number of downvotes. – Photoluminescence
Wow, Charles, this is hard core. Very impressive. I'll run it on my examples and see how they sort (after I spend a few minutes educating myself on Agresti-Coull at Wikipedia!). – Revisory
Let me know how it goes. I can give more information and/or references as needed. – Photoluminescence
+1 for this elegant solution (ordering by the lower end of the confidence interval). Just out of curiosity: how does the size of the interval behave when the number of upvotes is 0, or equal to the total number of votes? (The plain binomial variance goes to zero in these cases.) – Earplug
@Andre: Asymptotically, it decreases like 1/n, or rather C/n, where C depends on the chosen confidence. – Photoluminescence

This question is probably better asked on http://stats.stackexchange.com.

I guess you still want to order by 'helpfulness', most helpful first.

If you want to know how precise a given score is, the simplest way is to use the square root of the variance of the binomial distribution, with n equal to the total number of responses and p the fraction of responses which were 'helpful'.
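
As a concrete sketch of that suggestion (mine, not part of the original answer): the binomial standard deviation of the helpful-vote count is sqrt(n*p*(1-p)), and dividing by n gives the uncertainty in the observed fraction:

#include <math.h>

// Standard deviation of the binomial count is sqrt(n*p*(1-p)); dividing by n
// gives the uncertainty (standard error) of the observed helpful fraction.
float helpful_fraction_stderr(int n, float p) {
    return sqrtf(n * p * (1.0f - p)) / n;   // == sqrt(p*(1-p)/n)
}

// Example: 3/4 helpful  -> ~0.217 (very imprecise)
//          299/400      -> ~0.022 (much more precise)
// Note the edge case raised in the comments on the previous answer: at
// p = 0 or p = 1 this estimate collapses to zero.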

Earplug answered 20/9, 2010 at 6:39 Comment(0)

A very simple solution would be to ignore everything with fewer than a cut-off number of votes, and then sort by percentage.

For example (requiring at least five votes):

   1.  99.9% (1000 votes)
   2.  74.8%  (400 votes)
   3-5.  waiting for five votes
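
A minimal sketch of this cut-off idea in C (the function name and threshold constant are illustrative only; five is just the answer's example):

// Sort key for the cut-off approach: posts below the vote threshold sort last
// ("waiting for five votes"); everything else sorts by raw helpful percentage.
float cutoff_sort_key(int votes, int helpful) {
    const int MIN_VOTES = 5;              // the answer's example threshold
    if (votes < MIN_VOTES) return -1.0f;  // pushes the post to the bottom
    return (float)helpful / votes;
}
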
Reichard answered 20/9, 2010 at 6:42 Comment(0)

It depends on the expected rate of positive feedback and on how many people vote on average. If, as in the example you give, you will sometimes have 5 or 10 people voting and other times 1,000, then I would suggest the Wilson midpoint:

(x + z^2/2) / (n + z^2)    The midpoint of the Adjusted Wald interval / Wilson score

where:
n = Sum(all_votes),
x = Sum(positive_votes),
z = 1.96 (fixed value)
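
A minimal C sketch of that midpoint, with z = 1.96 as given in the answer (the function name is mine):

// Wilson midpoint: (x + z^2/2) / (n + z^2), where x is the number of
// positive votes out of n total votes.
float WilsonMidpoint(int n, int x) {
    const float z = 1.96f;   // fixed value from the answer
    return (x + z * z / 2) / (n + z * z);
}

// Example: WilsonMidpoint(4, 3)      -> ~0.63
//          WilsonMidpoint(1000, 999) -> ~0.997
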
Robson answered 12/2, 2011 at 9:35 Comment(0)