Monday, July 30, 2012

Happy Tweets from a Jellyfish

 Average Happiness
This clever "jellyfish plot," from researchers at the Center for Complex Systems at the University of Vermont, shows indexed percentiles from a study of the Positivity of the English Language. Each word is scored by 50 people on a happiness scale (1=unhappy, 5=neutral, 9=happy), and the 50 scores are averaged to produce an average happiness measure for that word. This is done for the 5000 most frequently occurring words on Twitter. Each point represents a word, plotted by its average happiness and its rank by frequency of use (1=most frequent, 5000=least frequent). Percentiles (here deciles) of average happiness are graphed over a sliding window of 500 ranked words.
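For readers who want to try the windowing step themselves, here is a minimal sketch of how the deciles could be computed. It is not the researchers' code. The function names, the step between windows, the choice to label each window by its middle rank, and the quantile interpolation method (Python's `statistics.quantiles` default) are all my assumptions, and the ratings are random placeholders, not the study's data.

```python
import random
from statistics import mean, quantiles


def average_happiness(ratings):
    """Average one word's individual ratings (1=unhappy ... 9=happy)."""
    return mean(ratings)


def sliding_deciles(scores_by_rank, window=500, step=100):
    """Compute the nine decile cut points in each window of ranked words.

    scores_by_rank[0] is the average happiness of the rank-1 (most
    frequent) word. The step size is an assumption; the post only
    states the window width of 500.
    """
    results = []
    for start in range(0, len(scores_by_rank) - window + 1, step):
        chunk = scores_by_rank[start:start + window]
        # The window covers ranks start+1 .. start+window; use the middle.
        center_rank = start + (window + 1) / 2
        results.append((center_rank, quantiles(chunk, n=10)))
    return results


# Placeholder data only: 5000 words, each with 50 random ratings.
random.seed(0)
scores = [
    average_happiness([random.randint(1, 9) for _ in range(50)])
    for _ in range(5000)
]

results = sliding_deciles(scores, window=500, step=100)
for center_rank, deciles in results[:3]:
    print(center_rank, [round(d, 2) for d in deciles])
```

Plotting each of the nine decile series against the window's middle rank would give the tentacle-like curves; the scatter of individual words sits behind them.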

The plot shows that the distributions of happiness, for frequently and infrequently used words alike, are skewed towards happier words, broadly matching the marginal distribution curve shown on top. This finding was consistent across words from the New York Times, Google Books, and music lyrics.

But note that the spread in the deciles narrows as word usage decreases, and average happiness tends to be lower for less frequently used words. More details and graphs can also be found here.
