My great-grandmother, Hedwig Nietzsche Koerth, never spoke English. My Grandpa Gustav didn't learn the language until he entered first grade. But, by the time I was in grade school — and was going through a brief fling of learning German — Grandpa no longer remembered much of what had once been his first language. Today, nobody in my immediate family speaks any German, much less the dying dialect of Texas German that my great-grandmother spoke.
The BBC has an interesting story about the history and linguistics of Texas German, which will probably die out in the next couple generations — largely because the German Germans started a couple world wars in a row and changed the idea of what was and wasn't socially acceptable speech in America.
— Maggie
Leigh Alexander at 8:55 am
Workers fulfill orders at an Amazon warehouse in Rugeley, England. REUTERS/Phil Noble
What would a bot buy from Amazon, if given life—and a gift card loaded with credit? Noam Chomsky's Cartesian Linguistics, apparently.
It's hard to believe that'd be a random choice, but it is, coming from a creature engineered for randomness by a man fascinated with randomness -- and consumerism. My friend Darius Kazemi, Boston-based developer extraordinaire, has a long-held interest in randomness. He's made the Twitter account @metaphorminute, designed to tweet a random metaphor every couple minutes, and OutSlide, which generates a random set of slides based on phrase-oriented Google image results.
With a background primarily in games, he's always been drawn to roguelikes and other games where random generation is a factor in the experience; he's attracted to the idea of "abdicating design decisions to a computer."
Read the rest
Maggie Koerth-Baker at 9:04 am

I can haz transformational grammar?
Via Justin Bernacki and Trust me, I'm a linguist.
Maggie Koerth-Baker at 10:40 am

Languages come and go and blend. It's likely been that way forever and the process only accelerates under the influence of mega-languages (like English) that represent a sort of global means of communication. But, increasingly, people who are at risk of losing their native language entirely are fighting back—trying to encourage more people to be bilingual and save the native language from extinction.
At Discover Magazine, Veronique Greenwood has a really interesting story about a mathematician who is helping to preserve Scottish Gaelic. How? The researcher, Anne Kandler, has put together some equations that can help native language supporters target their programs and plan their goals.
Some of the numbers are obvious—you must know how many people in the population you’re working with speak just Gaelic, how many speak just English, and how many are bilingual, as well as the rate of loss of Gaelic speakers. But also in the model are numbers that stand for the prestige of each language—the cultural value people place on speaking it—and numbers that describe a language’s economic value.
Put them all together into a system of equations that describe the growth of the three different groups—English speakers, Gaelic speakers, and bilinguals—and you can calculate what inputs are required for a stable bilingual population to emerge. In 2010, Kandler found that using the most current numbers, a total of 860 English speakers will have to learn Gaelic each year for the number of speakers to stay the same. To her, this sounded like a lot, but the national Gaelic Development Agency was pleased: it’s about the number of bilingual speakers they were already aiming to produce through classes and programs.
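To get a feel for what "inputs required for a stable population" means, here is a deliberately tiny sketch in Python. It is not Kandler's model, which tracks three groups and includes prestige and economic value; it keeps a single speaker count with a fixed yearly loss rate. The function names and every number in it are hypothetical, chosen only for illustration, and are not Scottish Gaelic figures.

```python
def learners_needed(speakers: float, annual_loss_rate: float) -> float:
    """New speakers needed per year to offset a fixed fractional loss."""
    return speakers * annual_loss_rate


def simulate(speakers: float, annual_loss_rate: float,
             learners_per_year: float, years: int) -> list[float]:
    """Step the speaker count forward one year at a time."""
    history = [speakers]
    for _ in range(years):
        # Lose a fraction of current speakers, then add the new learners.
        speakers = speakers * (1 - annual_loss_rate) + learners_per_year
        history.append(speakers)
    return history


# Hypothetical inputs: 50,000 speakers losing 2% per year.
needed = learners_needed(50_000, 0.02)
stable = simulate(50_000, 0.02, needed, years=10)
declining = simulate(50_000, 0.02, 0, years=10)
```

In this toy version the population holds steady only when yearly intake equals yearly loss. The real model answers the same kind of question with more moving parts.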
Read the rest at Discover Magazine
Image: Gaelic Signs, a Creative Commons Attribution (2.0) image from cradlehall's photostream
Cory Doctorow at 10:50 am
You had me at hello: How phrasing affects memorability, a clever study of "memorable phrases" from movies and advertisements by Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon Kleinberg, and Lillian Lee at Cornell, attempts to uncover why certain phrases become part of our collective history.
The results are interesting. The phrases themselves turn out to be significantly distinctive, meaning they're made up of combinations of words that are unlikely to appear in the corpus. By contrast, memorable phrases tend to use very ordinary grammatical structures that are highly likely to turn up in the corpus.
They also found that memorable phrases tend to use pronouns (other than you), the indefinite article a rather than the definite article the, and verbs in the past rather than present tense. These are all features that tend to make phrases general rather than specific.
So memorable phrases contain generic pearls of wisdom expressed with unusual combinations of words in ordinary sentences.
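One way to make "unlikely to appear in the corpus" concrete is to score a phrase with a small language model. The sketch below is an assumption-laden toy, not the authors' method: it trains an add-one-smoothed bigram model on a three-sentence made-up corpus and reports the average log-likelihood per word. Lower scores mean more distinctive word combinations. The function names and the corpus are hypothetical.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count bigrams and their left-hand contexts in a list of sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def avg_log_likelihood(phrase, contexts, bigrams, vocab_size):
    """Average per-word log-probability of a phrase (add-one smoothing)."""
    tokens = ["<s>"] + phrase.lower().split()
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        prob = (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
        total += math.log(prob)
    return total / (len(tokens) - 1)


# Hypothetical toy corpus standing in for "ordinary" language.
corpus = ["you had a good day", "you had a long day", "we had a good time"]
model = train_bigram_model(corpus)
ordinary = avg_log_likelihood("you had a good day", *model)
distinctive = avg_log_likelihood("you had me at hello", *model)
```

Here `distinctive` comes out lower than `ordinary`, because "had me at hello" strings together word pairs the toy corpus never saw. A real study would need a far larger corpus and a separate model for grammatical structure.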
The Secret Science of Memorable Quotes
Cory Doctorow at 3:24 pm
In "Linguistic properties of multi-word passphrases" (PDF, generates an SSL error) Cambridge's Joseph Bonneau and Ekaterina Shutova demonstrate that multi-word passphrases are more secure (have more entropy) than average user passwords composed of "random" characters, but that neither is very secure. In a blog post, Joseph Bonneau sums up the paper and the research that went into it.
Some clear trends emerged—people strongly prefer phrases which are either a single modified noun (“operation room”) or a single modified verb (“send immediately”). These phrases are perhaps easier to remember than phrases which include a verb and a noun and are therefore closer to a complete sentence. Within these categories, users don’t stray too far from choosing two-word phrases the way they’re actually produced in natural language. That is, phrases like “young man” which come up often in speech are proportionately more likely to be chosen than rare phrases like “young table.”
This led us to ask, if in the worst case users chose multi-word passphrases with a distribution identical to English speech, how secure would this be? Using the large Google n-gram corpus we can answer this question for phrases of up to 5 words. The results are discouraging: by our metrics, even 5-word phrases would be highly insecure against offline attacks, with fewer than 30 bits of work compromising over half of users. The returns appear to rapidly diminish as more words are required. This has potentially serious implications for applications like PGP private keys, which are often encrypted using a passphrase. Users are clearly more random in “passphrase English” than in actual English, but unless it’s dramatically more random the underlying natural language simply isn’t random enough. Exploring this gap is an interesting avenue for future collaboration between computer security researchers and linguists. For now we can only be comfortable that randomly-generated passphrases (using tools like Diceware) will resist offline brute force.
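For comparison with the "fewer than 30 bits" figure above, here is a short Python sketch of the arithmetic behind randomly generated passphrases. It assumes words are drawn uniformly and independently from a list of 7,776 entries, the size of the standard Diceware list (6^5, one entry per five dice rolls). The helper names and the stand-in word list are hypothetical, not Diceware's own tooling.

```python
import math
import secrets


def passphrase_entropy_bits(wordlist_size: int, num_words: int) -> float:
    """Entropy in bits when each word is chosen uniformly at random."""
    return num_words * math.log2(wordlist_size)


def random_passphrase(wordlist: list[str], num_words: int) -> str:
    """Pick words with a cryptographically secure random source."""
    return " ".join(secrets.choice(wordlist) for _ in range(num_words))


bits = passphrase_entropy_bits(7776, 5)

# Stand-in word list for illustration only; a real list has 7,776 entries.
demo_words = ["correct", "horse", "battery", "staple", "table", "young"]
phrase = random_passphrase(demo_words, 5)
```

Five uniformly random words from a 7,776-word list give about 64.6 bits. The estimate only holds when the words really are chosen at random; a phrase a person makes up falls back into the natural-language distribution the paper measures.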
Some evidence on multi-word passphrases
(via Schneier)
Cory Doctorow at 6:00 am

Omniglot is an intimidatingly complete site devoted to cataloging every writing system that ever existed. As JoshP says, "If you ever need to transliterate Punic... this is the place."
Omniglot - the guide to languages, alphabets and other writing systems
Cory Doctorow at 11:35 pm

Yesterday's keynote at the 28th Chaos Communication Congress (28C3), Meredith Patterson's "The Science of Insecurity," was a tour-de-force account of the formal linguistics and computer science that explain why software becomes insecure, and of how security can be dramatically increased. What's more, Patterson's slides were outstanding Rageface-meets-Occupy memeshopping. Both the video and the slides are online already.
Hard-to-parse protocols require complex parsers. Complex, buggy parsers become weird machines for exploits to run on. Help stop weird machines today: Make your protocol context-free or regular!
Protocols and file formats that are Turing-complete input languages are the worst offenders, because for them, recognizing valid or expected inputs is UNDECIDABLE: no amount of programming or testing will get it right.
A Turing-complete input language destroys security for generations of users. Avoid Turing-complete input languages!
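As a rough illustration of the "make your protocol regular" slogan, the Python sketch below fully recognizes an input against a regular grammar before any other code touches it. The message format (`KEY=NUMBER;` pairs with bounded lengths) is a hypothetical one invented for this example and does not come from the talk. The pattern avoids backreferences and lookarounds, so it stays within what a true regular expression can describe.

```python
import re

# Hypothetical toy format: 1 to 32 fields, each "KEY=NUMBER;".
# Keys are 1-16 uppercase ASCII letters, values are 1-8 ASCII digits.
MESSAGE = re.compile(r"(?:[A-Z]{1,16}=[0-9]{1,8};){1,32}")


def parse_message(raw: str) -> dict[str, int]:
    """Reject anything outside the message language, then extract fields."""
    if MESSAGE.fullmatch(raw) is None:
        raise ValueError("input is not in the message language")
    fields = {}
    # Safe to split naively now: the recognizer has vetted the whole input.
    for pair in raw.rstrip(";").split(";"):
        key, value = pair.split("=")
        fields[key] = int(value)
    return fields


fields = parse_message("LEN=12;ID=7;")
```

The design point is the ordering: the whole input is accepted or rejected up front, so the code after the check never sees a malformed message. Trailing junk such as `"LEN=12;ID=7;DROP TABLE"` raises `ValueError` instead of being half-processed.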
Patterson's co-authors on the paper were her late husband, Len Sassaman (eulogized here), and Sergey Bratus.
LANGSEC explained in a few slogans
Maggie Koerth-Baker at 9:34 am
On Submitterator, Musicman pointed me towards this great presentation on LOLspeak as a form of language play, and why people engage in that play. According to Lauren Gawne, who gave this speech last week at the Australian Linguistic Society conference, the choice to use LOLspeak has a lot to do with establishing identity: the playful identity of "cat", and the serious identity of "knowledgeable Internet user".
The talk includes an explanation of why LOLspeak is language play and not some language mashup "kitty pidgin".
You can read more about this on Lauren Gawne's blog Superlinguo.
The video, by the way, is 20 minutes long. It's also got a little bit of weird, warbly feedback in the audio, but that doesn't get in the way of hearing what Gawne is saying.
Maggie Koerth-Baker at 8:55 am
Last year, I stumbled across some of the cool history of American Sign Language, documenting how it evolved out of both formal and informal languages—systems Deaf children used to communicate at home, and the systems they were taught as Deaf schools drew diverse groups from a wide geographical range. For American Sign Language, this process happened in the 19th century. In other parts of the world, it's still ongoing. For instance, in Nicaragua, Deaf people who are in school now are learning a much more formalized language, with a much bigger vocabulary, than those who went to school in the 1980s.
Those international differences are fascinating to me, so I'm really pleased to find this post on the Sinosplice blog, discussing the Chinese system of finger spelling. The blogger there is a linguist, so there's a lot of neat perspective in the linked post and others on the linguistic mechanics of finger spelling and sign language in China.
Finger spelling is very different from a sign language. In a sign language, you'd have one hand movement or hand position that stands for the concept "bird." In finger spelling, you'd have a separate movement or position for each letter or sound of the word "bird." You probably picked up some American finger spelling from Sesame Street, so it's likely to at least look somewhat familiar. But the really cool thing about this post is that it contrasts that system with the finger spelling alphabets used in Russia and Japan, and with several that have been used historically in China. That's the US system above. Below, the modern Chinese system that corresponds to pinyin, a way of transcribing printed Chinese words into Roman letters.
Via Kerim Friedman
Cory Doctorow at 5:12 pm
Writing for the OED, Stefan Dollinger (director of the Canadian English Lab, University of British Columbia at Vancouver) provides indispensable notes on talking Canadian:
We can find the linguistic expression of the Canadian east-west connection at all linguistic levels. Vowels, for instance, love to change but when they change in Canada they have been shown to rarely – for some changes never – cross the Canada-US border. For example, the ‘Canadian shift’, first detected in the mid 1990s, affects the ‘short front vowels’, i.e. the three vowels exemplified in black, pen or tin. In Canada these vowels move in the opposite direction to the well-established ‘Northern Cities Shift’ in parts of the United States. So in Canada, the vowel in black, for instance, is pronounced farther back in the mouth. Canadian dialects are actually diverging from the American dialects that have experienced the shift, and this despite the high levels of interaction between the two countries.
Other features include ‘Canadian raising’, the most widely known Canadian pronunciation feature. Canadian raising affects the diphthongs in words such as wife, price or life and house, about or shout. Canadian pronunciations, though far from universal, are often perceived as weef instead of wife and a boot instead of about by outsiders. There are also other, less well-known Canadian differences, such as the Canadian integration pattern of foreign sounds represented by <a>. In words like pasta, lava, plaza, and drama the foreign <a> sound acquires the vowel in father in American English and British English, but the vowel of cat in Canadian English.
Canadian English
(Image: Canada, a Creative Commons Attribution (2.0) image from alexindigo's photostream)