Prooffreader graphed the distribution of letters towards the beginning, middle and end of English words, using a variety of corpora, finding both some obvious truths and some surprising ones. As soon as I saw this, I began to think of the ways that you could use it to design word games -- everything from improved Boggle dice to automated Hangman strategies to altogether new games.
Now then: I became curious about how letters are placed in English while doing many different, often quick, sometimes pointless, pattern analyses of letters for a wide variety of reasons. (One example: for one art project that will hopefully be posted on this blog one day, I found all the anagrams of "Hollywood", and noticed that words beginning with "w" were overrepresented.)
I've had many "oh, yeah" moments looking over the graphs. For example, words almost never begin with "x", but it's quite common as the second letter. There's a little hump near the beginning of the curve for "u" that's caused by its proximity to "q", which is most common at the beginning of a word. When you remove "q" from the dataset, the hump disappears. "F" occurs toward the extremes, especially in prepositions ("for", "from", "of", "off"), but rarely just before the middle.
A final thought: the most common word in the English language is "the", which makes up about 6% of most corpuses (sorry, corpora). But according to these graphs, the most representative word is "toe".
Graphing the distribution of English letters towards the beginning, middle or end of words
(via Hacker News)