The 4-letter word "four" is apportioned here into only 5 bins. These bin percentages are accumulated across all the words in the Brown corpus via the Natural Language Toolkit (NLTK). What remains is deciding which aspect of these accumulated percentages of ordinal data to plot for an informative display. If the raw percentages are plotted, it is difficult to compare frequently used letters like "a" with rarely used ones like "z".
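The apportioning and accumulation steps can be sketched in a few lines of Python. This is only an illustration under a stated assumption: each letter is taken to occupy an equal slice of its word's length, and its weight is split among the bins in proportion to overlap. The text does not spell out the exact rule, and the names `apportion`, `accumulate`, and `sample_words` are hypothetical.

```python
from collections import defaultdict


def apportion(word, n_bins=5):
    """Split each letter of `word` across `n_bins` equal-width position bins.

    Assumption (not confirmed by the source): letter i of an n-letter word
    covers the interval [i/n, (i+1)/n] and contributes to each bin in
    proportion to how much of that interval the bin overlaps.
    """
    n = len(word)
    shares = []
    for i, letter in enumerate(word):
        # Work in integer units of 1/(n * n_bins) to avoid rounding drift.
        lo, hi = i * n_bins, (i + 1) * n_bins
        row = []
        for b in range(n_bins):
            b_lo, b_hi = b * n, (b + 1) * n
            overlap = max(0, min(hi, b_hi) - max(lo, b_lo))
            row.append(overlap / n_bins)  # fraction of this letter in bin b
        shares.append((letter, row))
    return shares


def accumulate(words, n_bins=5):
    """Sum the per-bin shares of every letter over an iterable of words."""
    totals = defaultdict(lambda: [0.0] * n_bins)
    for word in words:
        w = word.lower()
        if not w.isalpha():  # skip punctuation, numbers, empty tokens
            continue
        for letter, row in apportion(w, n_bins):
            for b, share in enumerate(row):
                totals[letter][b] += share
    return dict(totals)


# Hypothetical stand-in for a corpus; with NLTK installed and the corpus
# downloaded, nltk.corpus.brown.words() could be passed instead.
sample_words = ["four", "of", "the", ","]
totals = accumulate(sample_words)
```

Under this rule the "f" of "four" covers the first quarter of the word, so it contributes 0.8 to the first bin and 0.2 to the second. Dividing the accumulated totals by their grand total gives the raw percentages.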
Normalizing the y-axis so that 100% represents each letter's greatest frequency is another approach. But he argues that this makes interpretation difficult, since the vertical scales are then not really comparable from one letter to the next.
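The two scalings discussed here can be compared side by side in a short sketch. The bin totals in `toy_totals` are made-up numbers chosen only to contrast a common letter with a rare one; they are not Brown corpus results, and the function names are hypothetical.

```python
def raw_percentages(totals):
    """Express every bin as a percentage of all letter occurrences."""
    grand_total = sum(sum(row) for row in totals.values())
    return {
        letter: [100.0 * v / grand_total for v in row]
        for letter, row in totals.items()
    }


def peak_normalized(totals):
    """Rescale each letter so that its largest bin equals 100%."""
    scaled = {}
    for letter, row in totals.items():
        peak = max(row)
        scaled[letter] = [100.0 * v / peak if peak else 0.0 for v in row]
    return scaled


# Made-up bin totals for illustration only (not measured data).
toy_totals = {
    "a": [50.0, 200.0, 150.0, 75.0, 25.0],
    "z": [2.0, 1.0, 0.5, 0.25, 0.25],
}

raw = raw_percentages(toy_totals)
scaled = peak_normalized(toy_totals)
```

With the raw scaling, every "z" bin stays below 1% and is dwarfed by "a". With the per-letter scaling, both letters peak at 100%, which makes each shape readable but hides how much rarer "z" is.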