Thursday, October 18, 2007

State of the Union Statistics


Here is an interactive view of the State of the Union addresses from all the US Presidents. Larger words in the cloud are the ones that were used more frequently. You can select a President and compare his address (in red) with another's (in white). The up and down arrows can help eliminate word clustering. The bar chart at the bottom indicates the total number of words in each address. A very nice use of frequency statistics.
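For the curious, the two statistics behind a view like this (how often each word appears, and the total word count of an address) can be sketched in a few lines of Python. This is only an illustration, not how the interactive page was built; the function name `word_frequencies` and the sample sentence are made up for the example.

```python
from collections import Counter
import re


def word_frequencies(text):
    """Return (per-word counts, total word count) for a piece of text."""
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words), len(words)


# Made-up sample sentence, not a quote from any actual address.
sample = "the state of the union is strong and the union endures"
counts, total = word_frequencies(sample)

# The most frequent words would be drawn largest in a word cloud;
# the total is what the bar chart at the bottom would show.
print(counts.most_common(3))
print(total)
```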

George W. Bush has a fondness for Julie, shown on the right. This is Julie Aigner-Clark who developed the Baby Einstein videos and cashed in when she sold out to Disney. Of course, the videos have now been shown to be ineffective. Hear the NPR story.

Friday, October 5, 2007

Beware correlations on averages

A recent article on Cuba brought to mind how correlations on averages can be very different from correlations on individuals. Due to unfortunate economic conditions, the 1990s found the Cuban people low on both food and fuel. They ate less (daily energy intake declined from 2,899 kcal in 1988 to 1,863 kcal in 1993), and they exercised more as a result of the widespread use of bicycles and walking as alternative means of transportation.

As a result, obesity declined, as did deaths attributed to diabetes, coronary heart disease and stroke. This corresponds with much of what is known about increasing the length of life through caloric restriction. Laboratory tests on many organisms have shown a negative correlation between caloric consumption and length of life. The example from Cuba is a natural experiment on humans that appears to indicate that daily calorie intake is negatively correlated with life expectancy. Of course, the Cuban example has a confounding variable of increased exercise. So we can't say directly that the less you eat the longer you live. But contrast this with data from the UN through the FAO on average daily calorie intake and average life expectancy by country, shown above. The correlation is positive 0.72. Of course, this correlation is confounded with wealth, health care, etc.

If individuals generally respond to decreased caloric intake the same way that the citizens of Cuba did, then hypothetical scatterplots for individuals within each country would have a negative correlation:

Here the ellipsoidal regions indicate confidence regions for the mean caloric intake and length of life for individuals in each country. The tilt in the ellipses indicates the negative correlation. Now I don't have such data. But the Cuban example suggests that the correlations may well be negative within a country but positive across the countries of the world.
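Since I don't have individual-level data, here is a small simulation sketch of the idea in Python. Everything in it is an assumption made for illustration: the five country means, the within-country slope of -0.01 years per kcal, and the noise levels are invented, not taken from the UN/FAO data. It only shows that negative within-country correlations and a positive correlation on the country averages can coexist.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical country means: (kcal per day, life expectancy in years).
# Richer countries are assumed to both eat more and live longer.
country_means = [(1900, 52), (2300, 60), (2700, 68), (3100, 74), (3500, 78)]

within = []                  # correlation among individuals, per country
avg_kcal, avg_life = [], []  # country averages

for mean_kcal, mean_life in country_means:
    # Individuals scatter around their country's mean intake.
    kcal = [mean_kcal + rng.gauss(0, 200) for _ in range(500)]
    # Assumed individual effect: eating more than your compatriots
    # shortens life (slope -0.01 years per kcal), plus random noise.
    life = [mean_life - 0.01 * (k - mean_kcal) + rng.gauss(0, 1.5)
            for k in kcal]
    within.append(pearson(kcal, life))
    avg_kcal.append(sum(kcal) / len(kcal))
    avg_life.append(sum(life) / len(life))

between = pearson(avg_kcal, avg_life)

print("within-country correlations:", [round(r, 2) for r in within])
print("correlation on country averages:", round(between, 2))
```

In this made-up world every within-country correlation comes out negative, while the correlation computed on the five country averages is strongly positive, because the averages are driven by whatever separates the countries rather than by the individual-level relationship.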

A nice example that tells us again to beware of correlations on averages. They may not reflect the correlations on individuals.

Wednesday, October 3, 2007

Normal Dining?


Here is an image of a bell-shaped distribution on the baseboards of booths at a restaurant in Bethesda, Maryland. Could it be that the pattern is caused when the wait staff set, clear, or deliver food to the table? Imagine a server scuffing the baseboards with his shoes, rubbing the wood smooth in a pattern that shows typical foot placement. Or is the pattern the residual polish left after customers scuff the wood next to their seats as they slide into the booth? I think I prefer the former explanation.
What do you think?