Prooffreader graphed the distribution of letters towards the beginning, middle and end of English words, using a variety of corpora, finding both some obvious truths and some surprising ones. As soon as I saw this, I began to think of the ways that you could use it to design word games -- everything from improved Boggle dice to automated Hangman strategies to altogether new games.
Now then: I became curious about how letters are placed in English while doing many different, often quick, sometimes pointless, pattern analyses of letters for a wide variety of reasons. (One example: for one art project that will hopefully be posted on this blog one day, I found all the anagrams of "Hollywood", and noticed that words beginning with "w" were overrepresented.)
I've had many "oh, yeah" moments looking over the graphs. For example, words almost never begin with "x", but it's quite common as the second letter. There's a little hump near the beginning of "u" that's caused by its proximity to "q", which is most common at the beginning of a word. When you remove "q" from the dataset, the hump disappears. "F" occurs toward the extremes, especially in prepositions ("for", "from", "of", "off") but rarely just before the middle.
A final thought: the most common word in the English language is "the", which makes up about 6% of most corpuses (sorry, corpora). But according to these graphs, the most representative word is "toe".
Graphing the distribution of English letters towards the beginning, middle or end of words
(via Hacker News)
Edd writes, “I am a professor at Ithaca College in New York. Recently for a research study I tracked almost every Role Playing Game Book circulating in every public and academic library in the world.”
A. Peer, M. Ihmsen, J. Cornelis and M. Teschner’s SIGGRAPH paper “An Implicit Viscosity Formulation for SPH Fluids,” explored techniques for simulating the physics of smoothed-particle hydrodynamics — solids that melt and squoosh into liquids and slimes. As interesting as the paper is, the video is a showstopper — never have simulated anthropomorphic armadillo action-figures […]
Cyagen also makes stem cells and other bio-research materials: they’ll pay academics $100 in vouchers per citation, multiplied by the impact factor of the journal in which the paper is published.
SitePoint Premium is the ultimate e-learning library for web developers, designers, and digital professionals. Famous for their web development books written by industry leaders, they’ve expanded their content library to include in-depth video courses and short, handy screencasts partnering with A Book Apart and UX Mastery. Whatever you want to achieve in your web career, […]
Skip the technical jargon and get right to taking amazing, professional-quality photos with this complete training. The Hollywood Art Institute Photography Course includes 22 modules filled with tutorials on how to profit off of your photography, or simply capture your memories in the manner they deserve.Accredited by the Photography Education Accreditation CouncilDive into this 22 […]
Power up your gadgets in the most unexpected places with the extremely compact SolarJuice battery pack. SolarJuice charges up at home like your average battery pack, but also lets you add extra juice on-the-go using its built-in solar panel—so you’ll never be left unplugged from the digital world.4.5 Stars on Amazon!Simultaneously charges 2 devices at […]