Vulgar constructs languages for fantasy fiction or whatever other purpose you can imagine, applying consistent rules to the custom phonemes you feed it. [via]

Vulgar's output models the regularities, irregularities and quirks of real world languages; phonology, grammar, and a 2000 unique word vocabulary. Trial the demo version online. Purchase the premium version to get access to the complete 2000 word output (with derivational words) and extra grammatical rules. ...

Vulgar generates ... based on a list of some of English's most common words. However, the program is more than just a one-to-one mapping of unique outputs to English words. In an effort to mimic real world languages, Vulgar also creates various homophones and overlapping senses inspired by examples from real world languages. For example:

Here's my language:

The Language of Puput /ˈpʰupʰutʰ/

...and he stood holding his hat and turned his wet face to the wind.

...u lu bunela une luch yafa u neba luch miku peb tul ye

Pronunciation: /u lu bɯˈnela ˈune løtʃ ˈjafa u ˈneba løtʃ ˈmikʰø pʰeb t̪øl je/

Narrow pronunciation: [u lu bɯˈnela ˈune løtʃ ˈjafa u ˈneba løtʃ ˈmikʰø pʰe t̪øl je]

Puput structure: and he stood holding his hat and turned his wet face the wind to

Seed for this language: 0.36384689368800394
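The seed suggests that a generated language can be reproduced exactly. Vulgar's actual algorithm isn't described here, so the following is only a rough sketch of the general idea: a seeded random generator that builds words from a supplied phoneme inventory. The function name, the consonant-vowel syllable rule, and the sample phonemes are all assumptions made for illustration.

```python
import random

def generate_words(consonants, vowels, count, seed):
    """Build `count` distinct words from consonant+vowel syllables.

    Hypothetical helper, not Vulgar's method: it only shows how a seed
    makes a randomly generated vocabulary repeatable.
    """
    rng = random.Random(seed)  # same seed, same sequence of choices
    words = []
    seen = set()
    while len(words) < count:
        syllable_count = rng.randint(1, 3)
        word = "".join(
            rng.choice(consonants) + rng.choice(vowels)
            for _ in range(syllable_count)
        )
        if word not in seen:  # keep every word unique
            seen.add(word)
            words.append(word)
    return words

# Assumed toy inventory; 10 words is far below the number of possible forms.
vocabulary = generate_words(["p", "t", "b", "l", "n"], ["u", "e", "a"], 10, seed=0.36384689368800394)
```

Calling the function again with the same inventory and seed returns the same list, which is the property a published seed relies on.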

Synthetic linguists file a new brief (with Klingon passages!) about Paramount's fan-film crackdown

Society of synthetic linguists explain to court, in Klingon, why Klingon shouldn't be copyrightable

How To: Pronounce Nelson Mandela's middle name

Rise of the Valleyguy

Animated history of the English language

If you've got 10 minutes, you can learn the history of English — including some interesting background on where specific words and phrases came from. (If you don't have 10 minutes, you can also watch the whole thing one chapter at a time in less-than-two-minute segments.) Interesting note: The equal importance of both The King James Bible and early scientific publications/societies to the formation of English as we speak it today.

Help fund a magazine dedicated to language geekery

Listen to a story told in a 6000-year-old extinct language

English, along with a whole host of languages spoken in Europe, India, and the Middle East, can be traced back to an ancient language that scholars call Proto Indo-European. Now, for all intents and purposes, Proto Indo-European is an imaginary language. Sort of. It's not like Klingon or anything. It is reasonable to believe it once existed. But nobody ever wrote it down so we don't know exactly what "it" really was. Instead, what we know is that there are hundreds of languages that share similarities in syntax and vocabulary, suggesting that they all evolved from a common ancestor.

Lexicon: smart, sharp technothriller from Max "Jennifer Government" Barry

Max Barry's new technothriller Lexicon is a gripping conspiracy novel about a cabal of "poets" who have mastered the deep language of the human brain and can use it to boss the rest of us around. It's a pitch-perfect thriller, a jetpack of a plot that rocketed me from page one to page 400 in a single afternoon, and it kept me guessing right up to the end. Imagine Dan Brown written by someone a lot smarter and better at characterization and at hand-waving the places where the science shades into science fiction, and you've got something like Lexicon.

In particular, Lexicon captures a lot of the stuff that makes the myth of Neurolinguistic Programming so compelling -- the idea that smart people can figure out how to make others march in lockstep just by tricking their subconsciouses into thinking that that's what they wanted to do all along. And Barry carries through the power-fantasy to its inescapable end: a secretive, paranoid, power-maddened cabal that is its own worst enemy.

Full of surprises and grace notes, this is the kind of delightful thriller that's anything but a guilty pleasure, and just what you'd expect from the author of such great books as Jennifer Government and Machine Man.

The BBC discovers the Texas Germans — and a dying dialect

Meet the random shopper: Amazon gifts bought at a machine's whim

Boston coder Darius Kazemi's interest in chance led him to create a bot that buys stuff on Amazon: a human decision made ineluctably alien by the randomness of a computer's whim.

LOL linguist

I can haz transformational grammar?

Saving dying languages with the help of math

Languages come and go and blend. It's likely been that way forever and the process only accelerates under the influence of mega-languages (like English) that represent a sort of global means of communication. But, increasingly, people who are at risk of losing their native language entirely are fighting back—trying to encourage more people to be bilingual and save the native language from extinction.

At Discover Magazine, Veronique Greenwood has a really interesting story about a mathematician who is helping to preserve Scottish Gaelic. How? The researcher, Anne Kandler, has put together some equations that can help native language supporters target their programs and plan their goals.

Some of the numbers are obvious—you must know how many people in the population you’re working with speak just Gaelic, how many speak just English, and how many are bilingual, as well as the rate of loss of Gaelic speakers. But also in the model are numbers that stand for the prestige of each language—the cultural value people place on speaking it—and numbers that describe a language’s economic value.

Put them all together into a system of equations that describe the growth of the three different groups—English speakers, Gaelic speakers, and bilinguals—and you can calculate what inputs are required for a stable bilingual population to emerge. In 2010, Kandler found that using the most current numbers, a total of 860 English speakers will have to learn Gaelic each year for the number of speakers to stay the same. To her, this sounded like a lot, but the national Gaelic Development Agency was pleased: it’s about the number of bilingual speakers they were already aiming to produce through classes and programs.

Why certain phrases are memorable

You had me at hello: How phrasing affects memorability is a clever study of "memorable phrases" from movies and advertisements by Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon Kleinberg, and Lillian Lee at Cornell. It attempts to uncover why certain phrases become part of our collective history.

The results are interesting. The phrases themselves turn out to be significantly distinctive, meaning they're made up of combinations of words that are unlikely to appear in the corpus. By contrast, memorable phrases tend to use very ordinary grammatical structures that are highly likely to turn up in the corpus.

They also found that memorable phrases tend to use pronouns (other than you), the indefinite article a rather than the definite article the, and verbs in the past rather than present tense. These are all features that tend to make phrases general rather than specific.

So memorable phrases contain generic pearls of wisdom expressed with unusual combinations of words in ordinary sentences.

Passphrases suck less than passwords, but they still suck

In "Linguistic properties of multi-word passphrases" (PDF, generates an SSL error) Cambridge's Joseph Bonneau and Ekaterina Shutova demonstrate that multi-word passphrases are more secure (have more entropy) than average user passwords composed of "random" characters, but that neither is very secure. In a blog post, Joseph Bonneau sums up the paper and the research that went into it.

Some clear trends emerged—people strongly prefer phrases which are either a single modified noun (“operation room”) or a single modified verb (“send immediately”). These phrases are perhaps easier to remember than phrases which include a verb and a noun and are therefore closer to a complete sentence. Within these categories, users don’t stray too far from choosing two-word phrases the way they’re actually produced in natural language. That is, phrases like “young man” which come up often in speech are proportionately more likely to be chosen than rare phrases like “young table.”

This led us to ask, if in the worst case users chose multi-word passphrases with a distribution identical to English speech, how secure would this be? Using the large Google n-gram corpus we can answer this question for phrases of up to 5 words. The results are discouraging: by our metrics, even 5-word phrases would be highly insecure against offline attacks, with fewer than 30 bits of work compromising over half of users. The returns appear to rapidly diminish as more words are required. This has potentially serious implications for applications like PGP private keys, which are often encrypted using a passphrase.
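To make "bits of work compromising over half of users" concrete, here is a simplified sketch of that kind of measurement: an attacker guesses phrases from most to least likely, and we count the guesses needed before half of all users are covered. This is one reading of the idea, not the paper's exact metric, and the function names and both distributions are invented for illustration.

```python
import math

def guesses_to_compromise(probabilities, fraction=0.5):
    """Count guesses, most likely phrase first, needed to cover `fraction` of users."""
    covered = 0.0
    for guesses, p in enumerate(sorted(probabilities, reverse=True), start=1):
        covered += p
        if covered >= fraction:
            return guesses
    raise ValueError("distribution does not cover the requested fraction")

def bits_of_work(probabilities, fraction=0.5):
    """Express the guess count as bits (log base 2)."""
    return math.log2(guesses_to_compromise(probabilities, fraction))

# Hypothetical skewed distribution: a few phrases are very popular.
skewed = [0.4, 0.2, 0.1, 0.1, 0.1, 0.05, 0.05]
# Hypothetical uniform distribution over 1024 equally likely phrases.
uniform = [1 / 1024] * 1024

print(guesses_to_compromise(skewed))  # 2 guesses cover 60% of users
print(bits_of_work(uniform))          # 9.0 bits: 512 guesses cover half
```

The skewed case shows why natural-language phrasing hurts: when a few phrases dominate, the attacker's work depends on the popular head of the distribution rather than on how many phrases are possible in total.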

Every writing system, ever, pretty much

Omniglot is an intimidatingly complete site devoted to cataloging every writing system that ever existed. As JoshP says, "If you ever need to transliterate Punic... this is the place."

Linguistics, Turing Completeness, and teh lulz

Yesterday's keynote at the 28th Chaos Computer Congress (28C3) by Meredith Patterson on "The Science of Insecurity" was a tour-de-force explanation of the formal linguistics and computer science that explain why software becomes insecure, and an explanation of how security can be dramatically increased. What's more, Patterson's slides were outstanding Rageface-meets-Occupy memeshopping. Both the video and the slides are online already.

Hard-to-parse protocols require complex parsers. Complex, buggy parsers become weird machines for exploits to run on. Help stop weird machines today: Make your protocol context-free or regular!

Protocols and file formats that are Turing-complete input languages are the worst offenders, because for them, recognizing valid or expected inputs is UNDECIDABLE: no amount of programming or testing will get it right.

A Turing-complete input language destroys security for generations of users. Avoid Turing-complete input languages!
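As a small illustration of the "make your protocol regular" advice, the sketch below defines a toy message format as a regular language and rejects anything outside it before any other code touches the input. The format (lowercase keys, numeric values, semicolon separators) and the function name are assumptions for the example and do not come from the talk.

```python
import re

# Hypothetical toy format: one or more "key=value" pairs joined by ";".
# The pattern uses only regular constructs (no backreferences or lookaround).
MESSAGE = re.compile(r"[a-z]+=[0-9]+(?:;[a-z]+=[0-9]+)*")

def parse_message(raw):
    """Recognize the whole input first, then extract fields from it."""
    if MESSAGE.fullmatch(raw) is None:
        raise ValueError("input is not in the expected language")
    # Safe to split: the recognizer has already confirmed the exact shape.
    return {key: int(value) for key, value in (pair.split("=") for pair in raw.split(";"))}

print(parse_message("width=80;height=24"))  # {'width': 80, 'height': 24}
```

The design choice is that recognition is complete and separate from processing: a malformed message such as "width=80;" raises an error instead of reaching the field-handling code in a half-parsed state.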

Patterson's co-authors on the paper were her late husband, Len Sassaman (eulogized here) and Sergey Bratus.

