Prooffreader graphed the distribution of letters towards the beginning, middle and end of English words, using a variety of corpora, finding both some obvious truths and some surprising ones. As soon as I saw this, I began to think of the ways that you could use it to design word games -- everything from improved Boggle dice to automated Hangman strategies to altogether new games.
Now then: I became curious about how letters are placed in English while doing many different, often quick, sometimes pointless, pattern analyses of letters for a wide variety of reasons. (One example: for one art project that will hopefully be posted on this blog one day, I found all the anagrams of "Hollywood", and noticed that words beginning with "w" were overrepresented.)
I've had many "oh, yeah" moments looking over the graphs. For example, words almost never begin with "x", but it's quite common as the second letter. There's a little hump near the beginning of "u" that's caused by its proximity to "q", which is most common at the beginning of a word. When you remove "q" from the dataset, the hump disappears. "F" occurs toward the extremes, especially in prepositions ("for", "from", "of", "off") but rarely just before the middle.
A final thought: the most common word in the English language is "the", which makes up about 6% of most corpuses (sorry, corpora). But according to these graphs, the most representative word is "toe".
Graphing the distribution of English letters towards the beginning, middle or end of words
(via Hacker News)
In Millionaire Migration and the Taxation of the Elite: Evidence from Administrative Data, Stanford sociologist Cristobal Young builds on his substantial research on “millionaire migration,” to show that only a small minority of millionaires move when local taxes go up — far too few to represent a net loss to the tax coffers.
In Evaluating the privacy properties of telephone metadata, a paper by researchers from Stanford’s departments of Law and Computer Science published in Proceedings of the National Academy of Sciences, the authors analyzed metadata from six months’ worth of volunteers’ phone logs to see what kind of compromising information they could extract from them.
Jongha Choi’s Master’s thesis for Design Academy Eindhoven involved the creation of “De-dimension” furniture, which collapses into a flat, easily stored form when it’s not in use — but when it’s in its flat form, it looks like a perspective drawing of its expanded shape.
If you want to add some real firepower to your programming repertoire, learn Java–one of the most adaptable, widely-used programming platforms around. You can easily do that with this Ultimate Java bundle, now just $69 in the Boing Boing Store.Across 14 lectures and 117 hours of content, the educators at online academy eduCBA will walk you through […]
Every company wants to harness the power of social media, but few understand how to make that happen. Be one of those select few with this Social Media Marketing Course & Certification package, now just $29 in the Boing Boing Store.Over 12 modules of course material, you’ll learn what it takes to increase a brand’s […]
If you’ve got a killer app idea, but don’t have the technical expertise to pull it off, get a crash course in all things app development with the Comprehensive Android Development Bundle, now over 90% off in the Boing Boing Store. Across 83 hours of training, you’ll learn to develop for the world’s most popular mobile OS, mastering […]