Open archive of 240,000 hours' worth of talk radio, including 2.8 billion words of machine-transcription

A group of MIT Media Lab researchers have published Radiotalk, a massive corpus of talk radio audio with machine-generated transcriptions, totaling 240,000 hours' worth of speech and marked up with machine-readable metadata. Read the rest

Scite: a tool to find out if a scientific paper has been supported or contradicted since its publication

The Scite project has a corpus of millions of scientific articles that it has analyzed with deep learning tools to determine whether any given paper has been supported or contradicted by subsequent publications; you can check Scite via the website, or install a browser plugin version (Firefox, Chrome). (Thanks, Josh!) Read the rest

Because Internet: the new linguistics of informal English

Conversational language is not the same as formal language: chatter over the dinner table does not follow the same rules as a speech from a podium. Informal language follows its own fluid, fast-moving rules, and most of what we know about historic informal language has been gleaned from written fragments, like old letters and diaries -- but now, the internet has produced a wealth of linguistic data on informal language, which is explored in Canadian linguist Gretchen McCulloch's new book, Because Internet: Understanding the New Rules of Language. Read the rest

App-based English-language tutors say they frequently witness their Chinese students suffering brutal physical abuse by their parents

There is a booming market for app-based English-language tutors, many in the USA, who serve Chinese families where the parents are eager to have their children acquire English proficiency; these tutors are often also moonlighting teachers, or former teachers, who have been trained to spot and report signs of abuse. Read the rest

How F Scott Fitzgerald conjugated the verb "To cocktail"

F Scott Fitzgerald, in a 1928 letter to Blanche Knopf: "As ‘cocktail,’ so I gather, has become a verb, it ought to be conjugated at least once." (via JWZ) Read the rest

The nine rules of "Freddish": the positive, inclusive empathic language of Mr Rogers

From an excerpt from last year's The Good Neighbor: The Life and Work of Fred Rogers, the rules of "Freddish" -- as Mr Rogers' crewmembers jokingly referred to the rigorous rules that Rogers used to revise his scripts to make them appropriate and useful for the preschoolers in his audience. Read the rest

Self-driving car jargon

Bruce Sterling republishes the acronyms in a recent Daimler white-paper on self-driving cars: Read the rest

What Old English perhaps sounded like

In this clip, an Englishman circa 800 A.D. is asked to chatter about his life. He understands the eallníwe léasspellung but prefers the old talk.

A fun little thing to show reconstructed pronunciation of textbook Old English in a casual setting. I've tried to throw in a few natural abbreviations (for example 'c rather than ic), but I know I missed the mark on one or two of the diphthongs. Either way, hopefully this gives some idea as to how the language sounded in casual speech. Message or comment if you'd like any clarifications, want to correct me on anything, or if you're just interested in the topic and would like to know more! I didn't have any decent Anglo-Saxon clothing...

Read the rest

Movie theater changed "Hellboy" to "Heckboy" on marquee

The Roxy 8 Movie Theater in Dickson, Tennessee changed the title of Hellboy to Heckboy on its marquee. From WZTV:

(Owner Belinda) Daniel told FOX 17 News that she has never displayed any words on the sign that may be seen as profanity, especially since the Roxy is next to Oakmont Elementary School...

“As it turned out, our play on words became a little more exciting than we expected,” Daniel said. “We are glad that we could share a small bit of our great community while also sharing a laugh with the rest of the world...”

Daniel said the sign is the only place where the movie’s title was changed. It appears as “Hellboy” both on the theater’s website, and on the billboards posted on the front of the theater.

Read the rest

A madlibs science fiction plot generator

Grether Labs's Science Fiction Plot Generator can sure pick 'em: "You are friends with a talking fireplace, and you are working to solve this ancient puzzle before the creatures consume you"; "You are a cyan-eyed cartographer who is finding the awful truth beneath this false utopia, and who is struggling with the terribly thick underbrush and terrible isolation"; "You are friends with a penniless government agent, and you are working to gather the spice before the computer system becomes self-aware"; "You are a science fiction writer and activist who has been made obsolete by a small perl script." Read the rest

Soap for grammar police

The Whiskey River Soap Company's funny soap varieties mostly fall flat for me, but there's one exception: the Grammar Police edition. (Thanks, Fipi Lele!) Read the rest

Watch French people try to say difficult English words

Hitting them with "Throughout" first is pretty sadistic. But that they stumble on "choir" suggests that they are hamming it up, un peu? Read the rest

Some pretty impressive machine-learning generated poetry courtesy of GPT-2

GPT-2 is OpenAI's language-generation model (last seen around these parts as a means of detecting machine-generated text); it's powerful and cool, and Gwern Branwen fed it the Project Gutenberg poetry corpus to see what kind of poetry it would write. Read the rest

Phonetically consistent English

English is a dragon of a language, dozing atop an enormous mountain of phonemes. What if they were all melted down and minted into something more consistent? And then we tried to speak it? The results sound a bit like a Welsh accent. [YouTube] Read the rest

"We take your privacy and security seriously" is the "thoughts and prayers" of data-breaches

Writing on TechCrunch, Zack Whittaker (previously) calls out the timeworn phrase "we take your privacy and security seriously," pointing out that this phrase appears routinely in company responses to horrific data-breaches, and is generally accompanied by conduct that directly contradicts it, such as stonewalling and minimizing responsibility for breaches and denying their seriousness. "We take your privacy and security seriously" is really code for "Please stop asking us to take your privacy and security seriously." Read the rest

Dialect quiz tracks down where you grew up

I was easy to locate because the term "Had" for the game "Tag" puts my childhood very precisely in Worthing, England, right by Brighton in this map. But it also knows I spent two years in Essex. The NYT's British-Irish dialect quiz is a sharp application of science. The U.S. version was published a while back. Read the rest

How to: make up swears

The power of fuckbonnet, shitsquib, fuckstumbling, douchenozzle, Fuckface von Clownstick, shitwhistle, and cockbucket can be captured through a simple formula: the "pyrrhic foot" of a "familiar profanity compounded with a non-profane word of two unaccented syllables." Read the rest
