Prooffreader graphed the distribution of letters towards the beginning, middle and end of English words, using a variety of corpora, finding both some obvious truths and some surprising ones. As soon as I saw this, I began to think of the ways that you could use it to design word games -- everything from improved Boggle dice to automated Hangman strategies to altogether new games.
Now then: I became curious about how letters are placed in English while doing many different, often quick, sometimes pointless, pattern analyses of letters for a wide variety of reasons. (One example: for one art project that will hopefully be posted on this blog one day, I found all the anagrams of "Hollywood", and noticed that words beginning with "w" were overrepresented.)
I've had many "oh, yeah" moments looking over the graphs. For example, words almost never begin with "x", but it's quite common as the second letter. There's a little hump near the beginning of "u" that's caused by its proximity to "q", which is most common at the beginning of a word. When you remove "q" from the dataset, the hump disappears. "F" occurs toward the extremes, especially in prepositions ("for", "from", "of", "off") but rarely just before the middle.
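If you want to poke at this yourself, here is a minimal Python sketch of the general idea: count how often each letter lands in the beginning, middle or end of a word. It is not Prooffreader's actual method. The function name `positional_histogram`, the `bins` parameter, and the rule that a word's first and last letters always fall in the outermost bins are all assumptions made for illustration, and the four-word list stands in for a real corpus.

```python
from collections import defaultdict


def positional_histogram(words, bins=3):
    """Count, per letter, how often it falls in each relative-position bin.

    Hypothetical helper: bin 0 is the start of a word, bin (bins - 1) the end.
    """
    hist = defaultdict(lambda: [0] * bins)
    for word in words:
        word = word.lower()
        n = len(word)
        for i, ch in enumerate(word):
            if not ch.isalpha():
                continue  # skip apostrophes, hyphens, digits
            if n == 1:
                b = 0  # a one-letter word has only a "start"
            else:
                # Scale index 0..n-1 onto 0..bins-1 so that the first letter
                # lands in the first bin and the last letter in the last bin.
                b = min(bins - 1, i * bins // (n - 1))
            hist[ch][b] += 1
    return dict(hist)


sample = ["the", "toe", "of", "quiz"]  # toy stand-in for a corpus
hist = positional_histogram(sample)

# One reading of "remove q from the dataset": drop every word containing "q"
# and recount (an assumption; the source does not say how it was done).
without_q = positional_histogram([w for w in sample if "q" not in w])
```

On a real word list you would normalize each letter's counts to proportions before plotting them, and comparing `hist["u"]` against `without_q.get("u")` is a rough way to check how much of the "u" hump comes from "q" words.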
A final thought: the most common word in the English language is "the", which makes up about 6% of most corpuses (sorry, corpora). But according to these graphs, the most representative word is "toe".
Graphing the distribution of English letters towards the beginning, middle or end of words
(via Hacker News)