Monday, April 29, 2013

Kernel Pinterest

Here is a nice idea for showing more than summary statistics for the variables in a regression study. It comes from the paper "I Need to Try This!": A Statistical Overview of Pinterest. Pinterest is a pinboard-style photo-sharing website. Among other things, the study models the number of re-pins of a given photo with a negative binomial regression.

The table above shows the medians, means, and maxima of the non-negative count variables included in the regressions; the minima are all zero. Alongside these summary statistics are small thumbnail kernel density estimates of each variable's distribution. Granted, the variables take only integer values while the density curves are continuous, but this is still much more informative than the usual limited summary statistics, shown below, that other regression studies often report.
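To make the idea concrete, here is a minimal sketch of how one such table row could be computed: the median, mean, and maximum of a count variable, plus a Gaussian kernel density estimate evaluated on a grid that could then be drawn as a thumbnail. Everything in it is an assumption for illustration: the `repins` values are made up, the `gaussian_kde` helper is hypothetical, and the bandwidth uses the normal-reference rule of thumb, since the paper's own kernel and bandwidth are not given here.

```python
import math
import statistics


def gaussian_kde(values, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of `values` at each grid point."""
    n = len(values)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not the paper's choice).
        bandwidth = 1.06 * statistics.stdev(values) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]


# Made-up count data for illustration only (not from the Pinterest study).
repins = [0, 0, 0, 1, 1, 2, 2, 3, 5, 8, 13, 40]

# The summary statistics reported in each table row.
summary = {
    "median": statistics.median(repins),
    "mean": statistics.mean(repins),
    "max": max(repins),
}

# Evaluate the density on an evenly spaced grid from 0 to 50 for plotting.
grid = [i * 0.5 for i in range(101)]
density = gaussian_kde(repins, grid)
```

Plotting `density` against `grid` with any charting tool, at a very small size and without axes, gives the thumbnail. Because the Gaussian kernel is continuous and unbounded, some of the estimated mass falls below zero and between the integers, which is exactly the caveat noted above for count data.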
