Formula for popularity? (based on "like it", "comments", "views")
M

6

5

I have some pages on a website and I have to create an ordering based on "popularity"/"activity"

The parameters that I have to use are:

  • views to the page
  • comments made on the page (there is a form at the bottom where users can make comments)
  • clicks made to the "like it" icon

Is there a standard formula for popularity? (If not, opinions are welcome too.)

(initially I thought of views + 10*comments + 10*likeit)

Maiden answered 9/6, 2010 at 7:13 Comment(2)
How do you assess positive comments vs negative comments? Should 'likeits' be as important as comments?Globose
we don't assess positive vs negative comments. Whether "likeits" should be as important as comments is something that I am throwing out there. I'm pretty flexible. (perhaps "most active" might be a better term than "most popular")Maiden
S
2

There is no standard formula for this (how could there be?)

What you have looks like a fairly normal solution, and would probably work well. Of course, you should play around with the 10's to find values that suit your needs.

Depending on your requirements, you might also want to add in a time factor (e.g. -X points per week) so that old pages become less popular. Alternatively, you could change your "page views" to "page views in the last month". Again, this depends on your needs and may not be relevant.
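
A minimal sketch of the weighted sum with a weekly penalty. The function name, the parameter names, and the sample pages are hypothetical; the weights of 10 come from the question, and the penalty of 15 points per week is only an illustrative value.

```python
def popularity(views, comments, likes, age_weeks=0.0,
               comment_weight=10, like_weight=10, decay_per_week=0.0):
    """Weighted activity score minus a linear age penalty (illustrative)."""
    raw = views + comment_weight * comments + like_weight * likes
    # Subtract a fixed number of points per week of age; clamp at zero so
    # very old pages do not end up with negative scores.
    return max(0.0, raw - decay_per_week * age_weeks)


pages = [
    # (name, views, comments, likes, age in weeks) -- made-up sample data
    ("old-hit", 500, 5, 10, 20),
    ("fresh", 300, 8, 12, 1),
]
# Rank pages with an assumed penalty of 15 points per week.
ranked = sorted(pages, key=lambda p: popularity(*p[1:], decay_per_week=15),
                reverse=True)
```

With the penalty switched off, "old-hit" would outrank "fresh"; with it, the newer page comes first, which is the effect the time factor is meant to have.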

Seritaserjeant answered 9/6, 2010 at 7:40 Comment(0)
U
6

Actually there is an accepted best way to calculate this:
http://www.evanmiller.org/how-not-to-sort-by-average-rating.html

You may need to combine 'likes' and 'comments' into a single score, assigning your own weighting factor to each, before plugging it into the formula as the 'positive vote' value.

from the link above:

Score = Lower bound of Wilson score confidence interval for a Bernoulli parameter

We need to balance the proportion of positive ratings with the uncertainty of a small number of observations. Fortunately, the math for this was worked out in 1927 by Edwin B. Wilson. What we want to ask is: Given the ratings I have, there is a 95% chance that the "real" fraction of positive ratings is at least what? Wilson gives the answer. Considering only positive and negative ratings (i.e. not a 5-star scale), the lower bound on the proportion of positive ratings is given by:

$$\frac{\hat{p} + \frac{z_{\alpha/2}^2}{2n} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p}) + \frac{z_{\alpha/2}^2}{4n}}{n}}}{1 + \frac{z_{\alpha/2}^2}{n}}$$

(Use minus where it says plus/minus to calculate the lower bound.) Here $\hat{p}$ is the observed fraction of positive ratings, $z_{\alpha/2}$ is the $(1-\alpha/2)$ quantile of the standard normal distribution, and $n$ is the total number of ratings. The same formula implemented in Ruby:

require 'statistics2'

def ci_lower_bound(pos, n, confidence)
    if n == 0
        return 0
    end
    z = Statistics2.pnormaldist(1-(1-confidence)/2)
    phat = 1.0*pos/n
    (phat + z*z/(2*n) - z * Math.sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n)
end

pos is the number of positive ratings, n is the total number of ratings, and confidence refers to the statistical confidence level: pick 0.95 to have a 95% chance that your lower bound is correct, 0.975 to have a 97.5% chance, etc. The z-score in this function never changes, so if you don't have a statistics package handy or if performance is an issue you can always hard-code a value here for z. (Use 1.96 for a confidence level of 0.95.)

The same formula as an SQL query:

SELECT widget_id, ((positive + 1.9208) / (positive + negative) - 
                   1.96 * SQRT((positive * negative) / (positive + negative) + 0.9604) / 
                          (positive + negative)) / (1 + 3.8416 / (positive + negative)) 
       AS ci_lower_bound FROM widgets WHERE positive + negative > 0 
       ORDER BY ci_lower_bound DESC;
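
For readers without Ruby at hand, here is a rough Python equivalent of the Ruby function above. It is a sketch rather than part of the linked article, and it assumes Python 3.8+ so that `statistics.NormalDist` is available for the z value.

```python
from math import sqrt
from statistics import NormalDist


def ci_lower_bound(pos, n, confidence=0.95):
    """Lower bound of the Wilson score interval for pos successes in n trials."""
    if n == 0:
        return 0.0
    # Two-sided z value, about 1.96 for confidence=0.95.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    phat = pos / n
    return ((phat + z * z / (2 * n)
             - z * sqrt((phat * (1 - phat) + z * z / (4 * n)) / n))
            / (1 + z * z / n))


# A single positive rating out of one ranks below 90 positives out of 100,
# even though its raw fraction (100%) is higher.
few = ci_lower_bound(1, 1)
many = ci_lower_bound(90, 100)
```

How the weighted likes and comments map to `pos`, and what counts as `n`, is left to you; `pos` must not exceed `n` for the formula to stay meaningful.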
Uzziel answered 22/1, 2014 at 16:11 Comment(0)
T
2

You could do something like what YouTube does: just sort by the largest count per category, for example most viewed, most commented, most liked. In each category a different page could come first, though the rankings would likely be correlated. If you only need a single ranking, then you would have to come up with a formula of some sort, preferably derived empirically: analyze a bunch of data you already have, decide which pages should count as good or bad, and work backwards to an equation that fits those decisions.

You could even attempt a machine learning approach to "learn" what a good weighting is for combining each of these numbers as in your example formula. Doing it manually might also not be too hard.
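
One simple way to work backwards from manual decisions, short of real machine learning, is a grid search over candidate weights. The sketch below is only an illustration: the function names, the candidate grid, the sample pages, and the preference pairs are all made up.

```python
from itertools import product


def score(page, comment_weight, like_weight):
    """Same shape as the question's formula: views + w1*comments + w2*likes."""
    views, comments, likes = page
    return views + comment_weight * comments + like_weight * likes


def fit_weights(pages, preferences, candidates=(1, 2, 5, 10, 20, 50)):
    """Return the (comment_weight, like_weight) pair from the grid that
    satisfies the most hand-labelled "a should outrank b" preferences."""
    def satisfied(weights):
        return sum(score(pages[a], *weights) > score(pages[b], *weights)
                   for a, b in preferences)
    return max(product(candidates, repeat=2), key=satisfied)


# Made-up sample data: name -> (views, comments, likes)
pages = {
    "a": (1000, 2, 1),
    "b": (400, 30, 5),
    "c": (600, 3, 40),
}
# Manual judgements: "b" should outrank "a", and "c" should outrank "b".
preferences = [("b", "a"), ("c", "b")]
comment_weight, like_weight = fit_weights(pages, preferences)
```

With more labelled pairs the grid can be made finer, or replaced by a proper regression or learning-to-rank method.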

Talkative answered 9/6, 2010 at 7:40 Comment(1)
Thanks for the idea, the options you propose are already options on the result list. The final "overall popularity" is what I am trying to get at here.Maiden
G
2

I use,

(C*comments + L*likeit)*100/views

where you choose C and L depending on how much you value each attribute. I use C=1 and L=1.

This gives you the percentage of views that generated a positive action, making the items with higher percentage the most "popular". I like this because it makes it possible for newer items to be very popular at first, showing up first and getting more views and thus becoming less popular (or more) until stabilizing.

Anyway, I hope it helps. PS: Of course it would work just the same without the "*100", but I like percentages.
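
A minimal Python sketch of the formula above. The function and parameter names are hypothetical, and returning 0 for a page with no views is an assumption added here to avoid dividing by zero.

```python
def engagement_rate(views, comments, likes, comment_weight=1.0, like_weight=1.0):
    """Percentage of views that led to a comment or a like, weighted by C and L."""
    if views == 0:
        # Assumption: a page nobody has viewed scores 0 instead of
        # raising ZeroDivisionError.
        return 0.0
    return (comment_weight * comments + like_weight * likes) * 100 / views


# 4 comments and 6 likes over 200 views: 5% of views produced an action.
rate = engagement_rate(200, 4, 6)
```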

Groscr answered 25/7, 2011 at 19:21 Comment(1)
Really nice. I like how effective and simple this is.Blame
G
0

I would value comments more than 'like it's if the content invites a discussion. If it's just stating facts, an equal ratio for comments and the like count seems OK (though 10 is a bit too much, I think...)

Does a view take into account the time the user spent on the page somehow? You might use that as well, since a 2-second view means less than a 3-minute one.

Guy answered 9/6, 2010 at 7:47 Comment(0)
M
0

Java code for Anentropic's answer:

public static double getRank(double thumbsUp, double thumbsDown) {
  double totalVotes = thumbsUp + thumbsDown;

  if (totalVotes > 0) {
    return ((thumbsUp + 1.9208) / totalVotes - 
      1.96 * Math.sqrt((thumbsUp * thumbsDown) / totalVotes + 0.9604) / 
      totalVotes) / (1 + (3.8416 / totalVotes));
  } else {
    return 0;
  }
}
Margetts answered 5/2, 2020 at 23:20 Comment(0)
