Prooffreader graphed the distribution of letters towards the beginning, middle and end of English words, using a variety of corpora, finding both some obvious truths and some surprising ones. As soon as I saw this, I began to think of the ways that you could use it to design word games -- everything from improved Boggle dice to automated Hangman strategies to altogether new games.
Now then: I became curious about how letters are placed in English while doing many different, often quick, sometimes pointless, pattern analyses of letters for a wide variety of reasons. (One example: for one art project that will hopefully be posted on this blog one day, I found all the anagrams of "Hollywood", and noticed that words beginning with "w" were overrepresented.)
I've had many "oh, yeah" moments looking over the graphs. For example, words almost never begin with "x", but it's quite common as the second letter. The graph for "u" has a little hump near the beginning of words, caused by its proximity to "q", which is most common at the beginning of a word. When you remove "q" from the dataset, the hump disappears. "F" occurs toward the extremes, especially in prepositions ("for", "from", "of", "off"), but rarely just before the middle.
A final thought: the most common word in the English language is "the", which makes up about 6% of most corpuses (sorry, corpora). But according to these graphs, the most representative word is "toe".
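If you want to poke at this yourself, here is a minimal Python sketch of the general idea: count how often each letter lands in the beginning, middle or end of a word. This is not Prooffreader's actual code or method. The function name `positional_distribution`, the three-bin split, the midpoint rule for placing a letter, and the tiny sample word list are all assumptions made for illustration; a real analysis would run over a full corpus and use finer bins.

```python
from collections import defaultdict


def positional_distribution(words, bins=3):
    """Count letters by relative position within each word.

    Hypothetical helper for illustration. Each letter is placed at the
    midpoint of its slot, (index + 0.5) / word_length, and that relative
    position in [0, 1) is mapped to one of `bins` equal-width bins.
    """
    counts = defaultdict(lambda: [0] * bins)
    for word in words:
        # Keep only alphabetic characters, lowercased.
        letters = [c for c in word.lower() if c.isalpha()]
        n = len(letters)
        for i, letter in enumerate(letters):
            rel = (i + 0.5) / n
            b = min(int(rel * bins), bins - 1)
            counts[letter][b] += 1
    return dict(counts)


# A toy word list (an assumption, not a real corpus).
sample = ["for", "from", "of", "off", "quick", "queen", "the", "toe"]
dist = positional_distribution(sample)

for letter in sorted(dist):
    print(letter, dist[letter])
```

Each letter maps to a list of counts, one per bin, so you can normalize a row to get the shape of that letter's curve. Dropping every word containing "q" from the input and rerunning would be one way to check the "u" hump described above.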
Graphing the distribution of English letters towards the beginning, middle or end of words
(via Hacker News)