Adventures in Reddit Statistics
June 17, 2009 12:32 pm Uncategorized
The data set consists of 1 month of Reddit top submissions (gathered when submissions were syndicated via Reddit’s main RSS feed).
- Plotting upvotes against downvotes for each submission yields a predictable ratio: on average, practically 2 upvotes for every downvote.
- Bias is inherent in Reddit’s ranking algorithm, since submissions that get more downvotes than upvotes don’t make it to the top. There are definitely outlier submissions with a higher-than-usual ratio of upvotes to downvotes, although relatively few.
- Without knowing the actual code, one can’t conclude that the algorithm filters out submissions that stray from this ratio. If it doesn’t, then the Reddit voting community at large votes rather predictably, with limited deviation from the above trendline.
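
For anyone who wants to try the same comparison on their own data, here is a minimal Python sketch of the calculation described above. The `submissions` values are made up for illustration and are not the data set from this post; the function names and the 25% outlier `tolerance` are likewise assumptions, not anything taken from Reddit.

```python
from statistics import mean

# Hypothetical (upvotes, downvotes) pairs, for illustration only.
# These are NOT the values from the data set discussed in this post.
submissions = [(200, 100), (450, 220), (90, 50), (1200, 580), (300, 160)]


def slope_through_origin(pairs):
    """Least-squares slope of upvotes on downvotes for a line through the origin."""
    num = sum(up * down for up, down in pairs)
    den = sum(down * down for _, down in pairs)
    return num / den


def mean_ratio(pairs):
    """Average of the per-submission upvote/downvote ratios (zero-downvote rows skipped)."""
    return mean(up / down for up, down in pairs if down > 0)


def outliers(pairs, slope, tolerance=0.25):
    """Submissions whose ratio differs from `slope` by more than `tolerance` (relative)."""
    return [
        (up, down)
        for up, down in pairs
        if down > 0 and abs(up / down - slope) / slope > tolerance
    ]


if __name__ == "__main__":
    slope = slope_through_origin(submissions)
    print("trendline slope:", round(slope, 2))
    print("mean ratio:", round(mean_ratio(submissions), 2))
    print("outliers:", outliers(submissions, slope))
```

The slope weights heavily voted submissions more than the plain mean of ratios does, so the two numbers can differ when a few very popular submissions dominate. Neither says anything about whether the ranking algorithm itself enforces the ratio.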
UPDATE: Reddit discussion here. Also, a log-log chart after the jump.



June 17th, 2009 at 6:30 pm
I would like to know if time is a factor. Submissions that are immediately downvoted don’t seem to reach the front page no matter how popular they eventually become! I call it the Reddit AlGore-rhythm!
BravoLima
June 17th, 2009 at 9:13 pm
Interesting analysis - I wonder how similar social news sites would compare. I did some research that covered Reddit that came out last year, though it looked at using Reddit for information retrieval.