A: Santa Claus exists, if I am not mistaken. B: Well, of course Santa Claus exists if you are not mistaken! A: So, I was right. B: Yes. A: So, I was not mistaken. B: Yes. A: Hence, Santa Claus exists.
This is a ladder leading to an elevated playhouse for kids visiting Brookside Gardens in Maryland. The ladder reveals a frequency distribution of foot placement wear. On most steps we see more wear in the middle of the ladder rung and less wear towards the left and right edges. This leaves a bell-shaped pattern of use.
But a rung near the bottom has a bimodal pattern, showing the wear resulting from both left and right feet. This doesn't persist on higher rungs. There the wear seems more central. So why not on the lower rung as well? Perhaps central steps on higher rungs feel safer, a care that is less needed closer to the ground.
Here is an image from imgur of the fret board on a 1956 Fender Stratocaster guitar showing the frequency distribution of playing wear (thanks Scott). But many on Reddit disagree, calling it faked. Not being a guitar player, I can't judge. What do you think?
Mathematician Henry Segerman demonstrates his 3D printed skewed or squished six-sided dice. He states "they work like ordinary dice". He exploits 3D symmetries to produce these isohedral dice with "just enough symmetry to be fair". Check out his video above and links to his extra footage.
This is a view of a basket of menus at a restaurant in Chincoteague, Virginia. Notice the pattern of marks left as groups of menus scratch the wall when they are returned to the basket. As customers are seated at the restaurant, they are given menus that are removed from the right hand side of the basket. After ordering, the menus are returned to the basket and placed to the right of the remaining menus. When a single menu is returned, it nicks the wall at a location that depends on how many menus are currently in the hands of customers. If few menus are out with the customers, more remain in the basket and the wall mark of this returning menu will be further to the right. If many menus are out with customers, say just before the lunch rush, this returning menu will make a mark on the wall further to the left.
But it is not often that a single menu is returned alone. It is much more likely that a group of menus will be returned to the basket in a bunch. The size of the bunch that is returned is random, depending on the size of the party seated. Each of the menus in the bunch makes a mark on the wall as it is returned to the basket. What we see is a steady state distribution of the number of seated customers, with menus in hand, waiting for their order to be taken.
In the previous post we saw use of the program Galton that maps out on city streets how far you can travel in 10 or 20 minutes. Displayed on a rectangular array of streets and avenues, square or rectangular regions develop, as walking is constrained to follow the paths of the gridded streets.
The image above is a Google Earth view of the parking lot of an office building in Maryland. Commuters have parked their cars to enter a building just off the image at the lower left. They must follow perpendicular paths and walk between the cars and/or along the lanes to enter the building. But to minimize the distance of the walk, most have parked along lines of equal distance from the bottom left according to the Manhattan or city block metric. A few stragglers don't fit this pattern, perhaps wanting to protect their cars from door dings or just get a little extra exercise. But the prominent pattern in the image above is one quadrant of the rectangular 'circle' of the city block metric.
Urbica is a design firm specializing in urban data analysis. They have developed a program called Galton that graphs, for a few select cities, how far you could walk in 10 minutes (in dark blue) or 20 minutes (in lighter blue). The map of Manhattan above shows those regions for a walk originating at Broadway and 42nd Street. As you walk NYC you are, for the most part, constrained to travel the grid of avenues and streets. Of course, you cannot travel as the crow flies. If you could, these regions would be concentric circles with a perimeter an equal (Euclidean) distance from your start. But walking the streets, your distance is measured by the city block metric (also known as the taxicab metric or, more appropriately here, the Manhattan metric). This measures distances constrained along perpendicular avenues and streets. Plotting points of equal distance from your start with this metric would result in the roughly rectangular (or diamond-shaped) regions shown above. Since the streets and avenues are not equally spaced and obstacles can block our travel, we don't see perfect square or rectangular regions. By the Manhattan metric, circles become squares.
Via Maps Mania.
Next week, we will see directly the results of minimizing the distance traveled in such constrained walking.
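For readers who want to see the two metrics side by side, here is a minimal Python sketch. It assumes an idealized, evenly spaced grid with one unit per block, which real Manhattan is not, and the function names are my own for illustration.

```python
import math


def euclidean(p, q):
    """Straight-line ("as the crow flies") distance between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def manhattan(p, q):
    """City block distance: total blocks walked along perpendicular streets."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def manhattan_circle(radius):
    """Grid intersections exactly `radius` blocks from the origin.

    On an idealized grid these points trace out a diamond (a rotated square).
    """
    span = range(-radius, radius + 1)
    return [(x, y) for x in span for y in span if abs(x) + abs(y) == radius]


start = (0, 0)
corner = (3, 4)  # hypothetical destination: 3 avenues over, 4 streets up

print("Euclidean:", euclidean(start, corner))
print("Manhattan:", manhattan(start, corner))
print("Points 2 blocks away:", sorted(manhattan_circle(2)))
```

The destination that is 5 units away for the crow is 7 blocks away for the walker, and the set of intersections at a fixed walking distance forms the diamond shape seen in the map.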
This is an image from the e-book Information is Beautiful from the site of the same name. But they've made the same mistake that has long served to help people learn How to Lie with Statistics by Darrell Huff. The graphic below, from Huff's book, shows the comparison of bags of money representing average weekly wages from two countries, the one on the right earning twice as much as the one on the left. Showing how to lie, the bags are drawn so that the one on the right is twice as tall as the one on the left, but this doesn't give us the correct visual image, since the graphic artist has doubled both the height and the width to produce the image on the right. The resulting visual impression is that the bag on the right is four times the one on the left. A perfect image to mislead.
Now the graphic from Information is Beautiful purports to represent the percentage of children in poverty. On the left is shown a small shadow silhouette of a young child with arms raised that represents 2%, the percentage of children in poverty in Denmark. Compare this with the larger silhouette for Germany representing 10%. Even in this flat 2-d outline more than 5 of the Denmark outlines could fit in the Germany outline. The problem, of course, is that a graphic artist has lied again and scaled up both the height and width of the silhouettes to represent these numbers. This distorts any comparisons that could be made with these data. And it gets worse, since we are to understand these are images of 3-dimensional children!
I recall this now classic graphic from decades ago, when it appeared in the Washington Post. It makes the same mistake with the same methods (compare the Eisenhower dollar with the Carter dollar). You would think....
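The size of the distortion in these graphics can be sketched with a little arithmetic. The Python below assumes the artist scales every linear dimension of the picture by the ratio of the data values, which is my reading of the graphics rather than anything the artists have stated; the function name is hypothetical.

```python
def apparent_ratios(data_ratio):
    """Visual size ratios when every linear dimension is scaled by data_ratio.

    Height grows with the ratio, area with its square, and the implied
    volume of a 3-dimensional object with its cube.
    """
    return {
        "height": data_ratio,
        "area": data_ratio ** 2,
        "volume": data_ratio ** 3,
    }


# Huff's money bags: wages differ by a factor of 2.
print("Money bags:", apparent_ratios(2))

# Child poverty silhouettes: 10% versus 2% is a factor of 5.
print("Silhouettes:", apparent_ratios(10 // 2))
```

Under that assumption, a twofold difference in wages is drawn as a fourfold difference in area, and a fivefold difference in poverty rates could appear as a 25-fold difference in area, or 125-fold if the reader imagines the figures as solid.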
Using data from over 175,000 rentals from the real estate listing service StreetEasy, the site FiveThirtyEight asks the question:
How much would you be willing to pay to shave a minute off your commute? For New Yorkers, the answer appears to be around $56 per month. That’s how much more New Yorkers pay in rent, on average, for a one-bedroom apartment that’s a minute closer by subway to Manhattan’s main business districts.
They plot median monthly rental versus commuting time to the 42nd Street and Chambers Street subway stations. They fit what appear to be nonparametric regression curves to four different groups of rentals: studios and 1-, 2-, and 3-bedroom apartments. As expected, rental prices fall with a longer commute. From the 1-bedroom curve they estimate that, on average, a one minute shorter commute costs around $56 more a month.
Along with reading the Sunday newspaper, CBS Sunday Morning is a favorite in our household. Click on the video above for a short report on a streaming pay TV customer satisfaction survey conducted by JD Power. The report mentions four categories: "cord cutters", those who have cancelled TV service, "cord nevers", those who have never subscribed to pay TV and only subscribe to a streaming video service, "cord shavers", those who still subscribe but now to a downgraded TV service, and "cord stackers", those who keep pay TV but also use streaming. From the report:
The inaugural study measures overall satisfaction among customers who have used a subscription- or transaction-based streaming video service within the past six months. The study measures customer satisfaction by examining six key measures (listed in order of importance): performance and reliability; content; cost of service; ease of use; communication; and customer service. Scores for each measure are reflected in an index based on a 1,000-point scale.
The measures for "cutters", "nevers", "shavers", and "stackers" are 802, 807, 822, and 826. Here is a frame from late in the video that reports on the right-most bar, "stackers", as it relates to the other categories.
It's clear that Sunday Morning is in sore need of knowledgeable graphics editors. Perhaps their artists make their charts telegenic by filling them up with what we would call "chart junk", but in ignoring the proper representation of the data, they are presenting false and misleading images. The numbers said to be represented here are, again, 802, 807, 822, and 826. The second bar from the left looks almost twice as tall as the left-most bar, when it should be only about 0.6% taller. The third bar from the left looks about 3 times taller than the left-most bar, when it should be only about 2.5% taller. Finally, the right-most "cord stackers" bar should be under 3% taller, not over 4 times taller! Yes, they give a grid background to judge the sizes, but they are misleading sizes to judge. The 'smiley' satisfaction faces are probably the most reliable description of the data: The satisfaction scores are amazingly the same. From smallest to largest they vary by less than 3%.
Here are the data shown on the graph and then a more accurate rendering (without the chart junk).
Not much difference in satisfaction across all types of customers.
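The percentages quoted above are easy to check. This short Python sketch uses the four scores reported in the video; the variable and function names are my own.

```python
# Satisfaction index scores (1,000-point scale) as reported in the video.
scores = {"cutters": 802, "nevers": 807, "shavers": 822, "stackers": 826}


def percent_taller(value, base):
    """How much taller, in percent, an honest bar for `value` is than one for `base`."""
    return 100 * (value - base) / base


baseline = scores["cutters"]
for label, score in scores.items():
    print(f"{label:9s} {score}  {percent_taller(score, baseline):.1f}% taller than cutters")
```

Drawn on a scale that starts at zero, the four bars would be nearly indistinguishable in height.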
As part of the Virginia Tourism Corporation's promotion campaign, this LOVE artwork has been placed in the Robert Reed waterfront park in Chincoteague, Virginia. It displays four 10-foot-tall Adirondack chairs. The wooden chairs spell out LOVE with the symbols L O ♥ E.
Of course, when posing for pictures, tourists prefer to sit in the ♥ chair over the others as evidenced by the greater frequency of wear on the chairs as people climb up and rub off the paint.
This view looks at the letters in reverse. The nearest chair is E, the next is ♥ with the most wear, then O and L. Note that the pattern of wear is in a bell-shape, with more wear near the middle of the seat and less towards the edges.
Another Eastern Shore vacation outlier find. This corn stalk, amid the soybeans, is a triple threat outlier: by type, by height, and by location. Enjoy the rest of your summer.
This is a view of the side of a small counter at a fast food restaurant in Snow Hill, Maryland. Patrons have slid this chair back and forth to sit at or leave the adjacent table. This chair movement has marred the paneling of the counter into a pattern that is skewed to the right: much more wear on the left with decreasing use and wear as the chair is set closer to the table. Of course, on the right, the chair's wear pattern is truncated since it must stop short of the table. On the left, we've got no wall to reveal the chair's position. The chair's wear is censored. What remains is a right skewed pattern of the frequency of use and wear. The pattern somewhat resembles the pattern of a sample from a right skewed exponential distribution.
Here's a utility pole at a traffic intersection in Aspen Hill, Maryland. The pole has served as a display for the many yard sales, community meetings, and businesses that have had their advertising flyers posted on the pole. The flyers have long since been removed. Only their staples and nails remain. These accumulated staples show a distribution of the heights of flyer postings.
The close-up view below shows the distribution of individual staples and nails on the cylinder of the pole. The staples are distributed both around the pole and vertically up and down the pole. Vertically, it's too difficult to put flyers high on the pole and few staples can be found there. Flyers very low on the pole wouldn't be easily seen by those passing by, so few staples are also found there. Most staples and nails are at a comfortable shoulder and viewing height. If we imagine the height of a staple above the ground is our random variable, we find few staples with small height, few with large height, and many more with a medium height. This is a bell-shaped pattern up and down the pole that we have seen often.
From Flowing Data, an interactive dotplot showing the distribution of annual salaries in various fields. Selections can be made for the 1960s (above), the 1980s, 2000s, and 2014. As a time range is selected, the dots representing the annual salaries of 50 randomly selected people dynamically redistribute themselves to reflect that period's salary frequency distribution. Compare the dramatic change in spread from the 1960s above to the 2000s below.
Yet Another Door Distribution Again, this time at the Summer Shack Restaurant in Boston. Not many patrons grab the door near the handle, not many reach it much higher. Most grab the door, and wear away its paint, at a comfortable, likely shoulder height. A bell-shaped frequency distribution results. Thanks Laura.
Oliver Stone's 1989 film Born on the Fourth of July tells the story of Ron Kovic, US Marine and anti-war activist. Kovic was portrayed by Tom Cruise, as advertised in the film's poster above. The poster's dominant color is black with, as expected, red, white, and blue, but also oranges and yellows in the face tones. Photoshop reveals the poster's colors in the swatches below.
But he has done more. He has looked at the colors in movie posters from 1914 to 2012 and produced an interactive image where you can select any year within this range and see the pie chart of movie posters from that year. Here is a still image of his interactive one.
He also produces an interactive image (still image below) with lightness and saturation ignored.
Orlando is reeling. Here it is in more tranquil times: Lake Eola Park downtown, the site of many vigils this past week.
Brian Resnick and Javier Zarracina from Vox have a cartoon explaining mathematically that predicting a mass shooting, like Pulse, is beyond our abilities. They consider a prediction that is 99% accurate in detecting a lone mass shooter. That shooter, hiding within a group of 1000 people, could be labeled by such a prediction, but that same prediction could label another 9 law-abiding folks as potential threats.
If such a prediction scheme were used for the 323 million people of the US, we could have a false positive group of over 3.2 million!
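The arithmetic behind that figure can be sketched in a few lines of Python. I am assuming here that "99% accurate" means the scheme wrongly flags 1% of harmless people, and that there is a single true threat in the population; both are my simplifications, and the function name is hypothetical.

```python
def expected_false_positives(population, false_positive_rate, true_threats=1):
    """Expected number of harmless people wrongly flagged as threats."""
    harmless = population - true_threats
    return harmless * false_positive_rate


us_population = 323_000_000   # figure used in the post
false_positive_rate = 0.01    # assumed meaning of "99% accurate"

flagged = expected_false_positives(us_population, false_positive_rate)
print(f"Expected false positives: {flagged:,.0f}")
```

With so few true threats and so many harmless people, nearly everyone the scheme flags would be a false alarm.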
On a trip to visit family, we stopped at a gas station in Hammondville, Alabama. On the wall was a map of the US with this wear pattern of customers touching where they were on the map. The many touchers have worn through the paper map down to the underlying supporting board. It seems that many have traced their path of travel extending southwest to Birmingham, AL and northeast to Chattanooga, TN, likely along the connecting route US 11, passing through Hammondville, or along the parallel Interstate 59 a bit further west. What remains is a roughly ellipsoidal bivariate frequency distribution of wear with a greater frequency of wear centered on Hammondville and lesser frequencies of wear in ellipsoidal contours around it.
Not a statistical image this time, but a scene from the espionage novel "The Gun Seller" by writer, actor, musician Hugh Laurie. He uses an old example with a misunderstanding of the probability of joint events. Consider two events B1 and B2, each occurring with the same probability P(B1) = P(B2) = p. If they occur together, their joint probability is denoted P(B1 and B2). We can think sequentially and consider P(B1 and B2) = P(B1) P(B2 given B1), and if they are independent, P(B2 given B1) = P(B2) = p. So under independence P(B1 and B2) = p^2, much smaller than p. Notice how he relates this below.
There was a bomb scare on the flight out to Prague. No bomb, but lots of scare. We were just settling ourselves into our seats when the pilot’s voice came over the intercom, telling us to deplane with all possible speed. No ‘ladies and gentlemen, on behalf of British Airways,’ or anything like that. Just get off the plane now.
We hung around in a lilac-painted room, with ten fewer chairs than there were passengers and no music to play by, and you weren’t allowed to smoke. I was, though. A uniformed woman with a lot of make-up told me to put it out, but I explained that I was asthmatic and the cigarette was a herbal dilation remedy I had to take whenever I was under stress. Everybody hated me for that, the smokers even more than the non-smokers.
When we finally shuffled back on to the aircraft, we all looked under our seats, worried that the sniffer dog might have had a cold that day, and that somewhere there was a little black hold-all that all the searchers had missed.
There once was a man who went to see a psychiatrist, crippled by a fear of flying. His phobia was based on the belief that there would be a bomb on any plane he boarded. The psychiatrist tried to shift the phobia but couldn’t, so he sent his patient to a statistician. The statistician prodded a calculator and informed the man that the odds against there being a bomb on board the next flight he took were half a million to one. The man still wasn’t happy, and sat there convinced that he’d be on that one plane out of half a million. So the statistician prodded the calculator again and said ‘all right, would you feel safer if the odds were ten million to one against?’ The man said, yes, of course he would. So the statistician said ‘the odds against there being two, separate, unrelated bombs on board your next flight are exactly ten million to one against.’ The man looked puzzled, and said ‘that’s all well and good, but how does it help me?’ The statistician replied: ‘It’s very simple. You take a bomb on board with you.’
I told this to a grey-suited businessman from Leicester, sitting in the seat next to me, but he didn’t laugh at all. Instead, he called a stewardess and said he thought I had a bomb in my luggage. I had to tell the story again to the stewardess, and a third time to the co-pilot who came back and squatted at my feet with a scowl on his face. I’m never going to make polite conversation ever again.
Perhaps I’d misjudged how people feel about bombs on aeroplanes. That’s possible. A more likely explanation is that I was the only person on the flight who knew where the hoax bomb call had come from, and what it meant.
Of course no statistician would suggest such an action. Comic irony or real misunderstanding?
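The flaw in the joke's reasoning can be made concrete with the probabilities defined earlier in this post. The Python below is only a sketch: it treats "half a million to one" as a probability of 1 in 500,000, which is a convenient approximation on my part, and the variable names are my own.

```python
from fractions import Fraction

# Probability of one bomb on a given flight, read loosely from the joke's
# "half a million to one" (an approximation, not real data).
p = Fraction(1, 500_000)

# Two separate, unrelated (independent) bombs: P(B1 and B2) = p * p.
p_two_bombs = p * p
print("P(two unrelated bombs):", p_two_bombs)

# Bringing your own bomb does not change the chance that someone else
# brings one. Under independence, P(B2 given B1) = P(B2) = p.
p_other_bomb_given_mine = p_two_bombs / p
print("P(another bomb, given mine):", p_other_bomb_given_mine)
```

The joint probability is tiny only before anyone has carried a bomb aboard. Once you have brought your own, the chance of a second, unrelated bomb is back to p, so the passenger is no safer than before.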
Nathan Yau at Flowing Data has produced an interactive graphic to compare your age with others. What percentage of the US population is younger than you? What percentage is older? In the static image above the US median age appears to be about 37 years old. Based on a 5-year American Community Survey from 2014, his interactive graph lets you slide the line to match any age and see the percentage of Americans older or younger.
Their research questions wondered if more recent surveys from 1985, 1996, and 2008 showed increasing importance to men of "good financial prospect." It had only partial support. Their next question wondered if there was increasing importance among women for "good cook and housekeeper" and "desire for home and children". It had no support. Finally, they wondered if there would be decreasing importance among women for "good financial prospect". It had no support.
The graphic above takes their widest view of mate preferences where "mutual attraction - love", "education - intelligence", "sociability", "good looks", and "good financial prospect" have increased in importance since 1939 for both sexes. "Desire for home, children" has increased for men but decreased for women. And for men "ambition and industriousness" has dropped since 1939. "Chastity" has decreased in importance for both sexes.
It's the end of the semester and I saw this Post-It Note graphic that shows exactly how many people, especially students, approach an assignment's deadline. From Instagrammer Insta-Chaz via Visual News.
(Earlier I posted Happy Birthday wishes, misled as others were on the web, but I was wrong. It is Will's Deathday and I corrected my posting, although I thought only Saints were to be remembered on their Death-day. Shakespeare was a genius, but I don't think many would call him a Saint).
Design agency ferdio has put together graphics illustrating the design, colors, and symbols of national flags. Patterns, layouts, ages etc. are collected together in what they call Flag Stories. Above is one such story of flags stacked into a bar chart showing the frequencies of the number of colors in the flags. As they mention, over a third of countries favor flags with three or four colors. You can find many more at Flag Stories. We've seen flag colors before.
It didn't happen yet again. In last month's NCAA "March Madness" men's college basketball championship, the number 16 seeds again failed to best the number 1 seeds. The histogram above shows the score differences in such matchings since 1985. It closely matches a normal distribution, allowing us to estimate the probability that such an upset could happen as the area under the approximating bell-curve that falls beyond zero. Our estimated probability that a number 16 seed would beat a number 1 seed is 0.0208. It has risen a bit since our last view.
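The estimate above is a normal tail area, and the idea can be sketched with Python's standard library. The mean and standard deviation of the score differences are not given in this post, so the sketch works backwards from the 0.0208 estimate to the z-score it implies; the function name is my own.

```python
from statistics import NormalDist


def upset_probability(mean_margin, sd_margin):
    """P(margin < 0) when the number 1 seed's winning margin is modeled
    as Normal(mean_margin, sd_margin). A negative margin means an upset."""
    return NormalDist(mu=mean_margin, sigma=sd_margin).cdf(0)


# Work backwards from the post's estimate: how many standard deviations
# below the average margin does a tie (margin of zero) sit?
implied_z = NormalDist().inv_cdf(0.0208)
print("Implied z-score for a margin of zero:", round(implied_z, 2))

# Sanity check in standardized units (mean = -z, sd = 1).
print("Recovered probability:", round(upset_probability(-implied_z, 1.0), 4))
```

In other words, the estimate says that a tie sits roughly two standard deviations below the average margin of victory for the number 1 seeds.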
Here is a relief model of a normal curve that was developed to aid teaching statistics to the visually impaired. Students would trace their fingers along the raised impression of the curve and its divisions into standard deviation intervals to gain experience and understanding of the normal curve and how it describes the normal distribution of measurements along its horizontal axis, distinctions that we have seen repeatedly on this blog. In this figure, the lines representing one standard deviation above and below the mean seem to fall a bit short of the curve's two points of inflection, where they should naturally fall. But this is an excellent concept for aiding those with visual impairments.
Flowing Data has produced an interactive graphic showing the distribution of the age of marriage of Americans. Smoothed relative frequency distributions are shown for women (in green) and men (in orange) with selections possible by employment, education, race, or whether or not it is a first marriage. The data are from the American Community Survey marriages from 2009 to 2014. My guess is that similar frequency distributions from earlier decades would have modes that move more to the left towards younger ages for both women and men. What is the overall youngest median age of marriage for Americans? More data needed.
Yet Another Door Distribution Again. This time on a bakery door just east of Cambridge, Maryland. Here we see a skewed frequency distribution of scratches, perhaps from keys, with greatest concentration around the door handle with progressively fewer scratches extending higher up the door. Many fewer scratches are below the handle. Perhaps the handle is too low as customers handle their keys and hold or open the door, while also eating the delicious donuts the bakery sells.