Prooffreader graphed the distribution of letters towards the beginning, middle and end of English words, using a variety of corpora, finding both some obvious truths and some surprising ones. As soon as I saw this, I began to think of the ways that you could use it to design word games -- everything from improved Boggle dice to automated Hangman strategies to altogether new games.
Now then: I became curious about how letters are placed in English while doing many different, often quick, sometimes pointless, pattern analyses of letters for a wide variety of reasons. (One example: for one art project that will hopefully be posted on this blog one day, I found all the anagrams of "Hollywood", and noticed that words beginning with "w" were overrepresented.)
I've had many "oh, yeah" moments looking over the graphs. For example, words almost never begin with "x", but it's quite common as the second letter. There's a little hump near the beginning of "u" that's caused by its proximity to "q", which is most common at the beginning of a word. When you remove "q" from the dataset, the hump disappears. "F" occurs toward the extremes, especially in prepositions ("for", "from", "of", "off") but rarely just before the middle.
A final thought: the most common word in the English language is "the", which makes up about 6% of most corpuses (sorry, corpora). But according to these graphs, the most representative word is "toe".
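For anyone who wants to poke at the same question, here is a minimal sketch of the kind of tally that could sit behind graphs like these. It is not Prooffreader's code: the function name `position_histogram`, the three-bin split into beginning, middle and end, and the tiny sample word list are all assumptions made for illustration.

```python
from collections import defaultdict

def position_histogram(words, bins=3):
    """Count how often each letter lands in each relative-position bin.

    Bin 0 is the start of a word and bin (bins - 1) is the end.
    """
    counts = defaultdict(lambda: [0] * bins)
    for word in words:
        word = word.lower()
        n = len(word)
        for i, ch in enumerate(word):
            if not ch.isalpha():
                continue  # skip apostrophes, hyphens, digits
            if n > 1:
                # Integer arithmetic avoids floating-point edge cases.
                b = min(i * bins // (n - 1), bins - 1)
            else:
                # Assumption: a one-letter word counts as "middle".
                b = bins // 2
            counts[ch][b] += 1
    return dict(counts)

# Hypothetical sample; a real analysis would use a full corpus.
sample = ["for", "from", "of", "off", "queen", "quit", "exit", "box"]
hist = position_histogram(sample)
for letter in "fqx":
    print(letter, hist[letter])
```

With a real word list and more bins, the per-letter counts could be normalized and plotted to get curves comparable to the ones in the post.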
Graphing the distribution of English letters towards the beginning, middle or end of words
(via Hacker News)