Boing Boing 

Explaining the banned phrases in a Chinese microblogging client

LINE is a Twitter-like chat app popular in Asia. @hirakujira discovered that its Chinese-language client, Lianwo, had a file listing 24 150 forbidden phrases that were set to trigger an error reading "Your message contains sensitive words, please adjust and send again" (though this was not yet enabled). The Blocked on Weibo Tumblr has begun a series of posts listing every one of these banned phrases and explaining their context -- for example, Zhejiang’s receipt-signing Brother (浙江签单哥) refers to an embezzlement scandal involving Zhejiang's Vice Minister of Propaganda.

Understanding intelligence chief James Clapper's France-spying "denial"

Director of National Intelligence James Clapper has officially responded to the allegations that the NSA intercepted 70,000,000 French phone calls with a narrowly worded, misleading denial. Tim Cushing at Techdirt does us the kremlinological service of finely parsing the NSA word-game and showing us what Clapper doesn't deny:

Why Q was illegal in Turkey until last month

Last month, Turkey repealed its 1928 Alphabet Law, and legalized the letter Q. In a short, illuminating piece in the London Review of Books, Yasmine Seale describes the history of the Romanization of Turkish writing, which was part of a larger project to assimilate Turkish minorities by standardizing the language and its spelling, and, in the process, banning many of the keys from the left side of the typewriter.

Tea Party insult generator

The collapse of the GOP-engineered shutdown has the Tea Party in a fury, and they're showing their wrath with a series of vicious posts to John Boehner's Facebook. The Tea Party Insult Generator teases these insults apart and recombines them to make them stronger, faster, better than before.

Listen to a story told in a 6000-year-old extinct language

English, along with a whole host of languages spoken in Europe, India, and the Middle East, can be traced back to an ancient language that scholars call Proto Indo-European. Now, for all intents and purposes, Proto Indo-European is an imaginary language. Sort of. It's not like Klingon or anything. It is reasonable to believe it once existed. But nobody ever wrote it down, so we don't know exactly what "it" really was. Instead, what we know is that there are hundreds of languages that share similarities in syntax and vocabulary, suggesting that they all evolved from a common ancestor.

Of course, that very quickly leads to attempts to reconstruct what said ancestral language might have sounded like. In the track above, you can listen to University of Kentucky linguist Andrew Byrd recite a fable in reconstructed Proto Indo-European. Archaeology magazine helpfully provides a translation:

Extreme Puritan baby-naming


As found in Curiosities of Puritan Nomenclature (1888), a collection of Puritan names chosen "to remind the child about sin and pain." My favorite? "Kill-sin Pimple."

Decoding NSA doublespeak

The Electronic Frontier Foundation's Trevor Timm has a handy guide to decoding NSA doublespeak. The spookocracy has a pathetically transparent way of lying its way out of direct questions, but the press (and, more importantly, Congress) seem incapable of detecting the low-grade BS emanating from the smoke-filled rooms. For example, when you ask the NSA if they can read Americans' email without a warrant, they reply "we cannot target Americans’ email without a warrant." The amazing thing about this stuff isn't that the NSA tries it on, but that its nominal supervision doesn't notice it. My five-year-old is better at this than they are.

This makes a great addition to the glossary of NSAspeak compiled by the ACLU's Jameel Jaffer and Brett Max Kaufman.

Unsupervised AI makes up some pretty funny jokes

Unsupervised joke generation from big data [PDF], a paper by University of Edinburgh researchers Sasa Petrovic and David Matthews, describes an ingenious and successful method for teaching a computer to make up jokes like "I like my relationships like I like my source, open;" "I like my coffee like I like my war, cold;" and "I like my boys like I like my sectors, bad." The researchers wrote code that called on Google's n-gram database to find noun-attribute pairs, zero in on nouns with ambiguous meaning, and automatically generate jokes.
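
To make the idea concrete, here is a minimal sketch of the "two nouns that share an attribute" step. It is not the researchers' code: the counts in PAIR_COUNTS are made-up stand-ins for Google n-gram data, the jokes() function and its min_count threshold are hypothetical, and ranking by the product of the two counts is an assumption for illustration. It also skips the step of favoring ambiguous words.

```python
from itertools import combinations

# Made-up noun/attribute co-occurrence counts (assumption: stand-in for n-gram data).
PAIR_COUNTS = {
    ("coffee", "cold"): 40,
    ("war", "cold"): 90,
    ("coffee", "strong"): 70,
    ("source", "open"): 60,
    ("relationships", "open"): 30,
    ("tea", "strong"): 20,
}


def jokes(pair_counts, min_count=25):
    """Return (score, joke) tuples for noun pairs sharing a common attribute."""
    # Group nouns by attribute, dropping rare (noun, attribute) pairs.
    by_attr = {}
    for (noun, attr), count in pair_counts.items():
        if count >= min_count:
            by_attr.setdefault(attr, []).append((noun, count))

    results = []
    for attr, nouns in by_attr.items():
        # Every two distinct nouns with the same attribute make one candidate joke.
        for (x, cx), (y, cy) in combinations(sorted(nouns), 2):
            joke = f"I like my {x} like I like my {y}, {attr}."
            # Hypothetical score: product of the two counts.
            results.append((cx * cy, joke))
    return sorted(results, reverse=True)


ranked = jokes(PAIR_COUNTS)
for score, joke in ranked:
    print(score, joke)
```

With real n-gram counts the interesting work is in the scoring, which is where a measure of ambiguity would come in.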

More on the NSA's weird, deceptive, indefensible definition of "targeted surveillance"

The Electronic Frontier Foundation's Mark Rumold has a detailed analysis of yesterday's story about the bizarre, misleading way that the NSA uses the word "targeted" in discussions of "targeted surveillance." It comes down to this: the NSA and its defenders continue to claim that the organization only spies on foreigners when they're off US soil, and not on Americans or people in America (why this should comfort those of us who are neither Americans nor in America is a mystery to me).

But they harvest every word read and written on the Internet, including private communications, and scan it to see if it matches the name of someone they're looking at -- say, Vladimir Putin. Anyone whose communications contain the name or other details of the foreign target can also be spied upon, and the NSA says this doesn't constitute domestic spying. So they're not spying on all Americans, just every American who's ever mentioned the name of a foreigner -- and to accomplish this, they read every word everyone writes, but they're using a computer to do it, so it doesn't count.

Making sense of "Beanish," XKCD's synthetic language


As was noted, the amazing, 3,000+ installment XKCD story Time featured a synthetic language (with its own script) created by a linguist for the story. Deciphering Beanish is a blog where the language is being slowly, surely made legible.

Astounding backstory behind XKCD's "Time"


A week ago, Randall Munroe finished "Time", XKCD's long-running, slow-updating, 3,000+ frame comic telling the story of two people who discover an impending superflood that would destroy their society. Randall's explained in detail what was going on there, from the geology of the thing (it's set millennia in the future, amid a civilization denied the ability to jumpstart itself by the paucity of remaining fossil fuels, and the flood is modelled on a real event that sealed off the Mediterranean Sea five million years ago) to the fictional language the upland culture speaks (designed by a linguist, and still mysterious).

NSA's new meanings for common terms

The ACLU's Jameel Jaffer and Brett Max Kaufman have compiled an NSA lexicon, listing the made-up, nonsensical meanings that the NSA has assigned to common words, in order to defend their criminality. For example:

Surveillance. Every time we pick up the phone, the NSA makes a note of whom we spoke to, when we spoke to him, and for how long—and it’s been doing this for seven years. After the call-tracking program was exposed, few people thought twice about attaching the label “surveillance” to it. Government officials, though, have rejected the term, pointing out that this particular program doesn’t involve the NSA actually listening to phone calls—just keeping track of them. Their crabbed definition of “surveillance” allows them to claim that the NSA isn’t engaged in surveillance even when it quite plainly is.

You're vs Your song

Jonathan Mann sez, "The internet abounds with people misusing Your and You're. I wanted to write a simple catchy tune to help them remember which is which!"

Your doing important work, sir.

Your vs. You're Song (Song A Day #1655) (Thanks, Jonathan!)

Apple's mobile devices have a secret list of "sensitive" words that don't autocomplete


The Daily Beast investigated the autocomplete on Apple iOS devices (iPhones, iPads, etc.), and discovered that there was a long list of "sensitive" words that the devices have in their dictionary but would not autocomplete -- you would have to type them out in full to get them into your device. This list includes words such as "abortion," "rape," "ammo," and "bullet." They documented their methodology in detail.
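
For a rough picture of how a word can be "in the dictionary" yet never offered as a completion, here is a minimal sketch. It is not Apple's implementation, which the post does not describe: DICTIONARY, SUPPRESSED, is_known() and suggest() are all hypothetical names, and the only detail taken from the post is that "bullet" is one of the affected words.

```python
# Hypothetical word list; a real keyboard dictionary is far larger.
DICTIONARY = {"bulk", "bull", "bullet", "bulletin", "bully"}

# Words that are recognized but never offered as completions (assumption for illustration).
SUPPRESSED = {"bullet"}


def is_known(word):
    """A fully typed word is accepted if it is in the dictionary at all."""
    return word.lower() in DICTIONARY


def suggest(prefix):
    """Offer completions for a prefix, skipping anything on the suppressed list."""
    prefix = prefix.lower()
    return sorted(
        w for w in DICTIONARY
        if w.startswith(prefix) and w != prefix and w not in SUPPRESSED
    )


print(suggest("bulle"))
print(is_known("bullet"))
```

The point of the sketch is that the two lookups are separate: typing the word out in full still passes is_known(), while suggest() simply never returns it.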

On the language of comic strips

Since moving to Fast Company, John Brownlee's been on a roll: everything he's written for them is ace. The latest is Quimps, Plewds and Grawlixes: The Secret Language of Comic Strips, a review of Mort Walker's obscure 1980s Lexicon of Comicana.

To Walker, understanding the design language of the comics was important. Cartooning is usually one of the first means of written expression a child learns, and for Walker, understanding the language of cartooning was the key to communicating with other people in an increasingly international world.

What is a twink?

The word "Twink" used to mean something. Now it's just another term of abuse for queer men, an "easy shorthand for vicious stereotypes." [The Awl via Metafilter]

Euphonia: a mechanical talking machine


Here's a delicious potted history of the Euphonia, a mid-19th century gadget that could simulate human speech by pumping bellows-fed air over an artificial tongue set in a chamber of weird plates and valves. It had a severe woman's face and coils of hair in ringlets, and spoke in a "weird, ghostly monotone."

By pumping air with the bellows and manipulating a series of plates, chambers, and other apparatus, including an artificial tongue, the operator could make it speak any European language. It was even able to sing the anthem God Save the Queen. The Euphonia was invented in 1845 by Joseph Faber, a German immigrant. A little known fact is that this machine greatly influenced the invention of the telephone.

The Euphonia - A Marvelous Talking-Machine (Curious History via Kadrey)

Florida bans computers

Florida tried to ban Internet Cafes that were functioning as unlicensed casinos, but may have banned smartphones and computers instead, due to language that defines slot machines as "any machine or device or system or network of devices" that can be used in connection with games of chance. I question the legitimacy of shutting down all Internet Cafes in the first place, but this is clearly an overbroad definition, as has been pointed out in a suit challenging the law, brought by an Internet Cafe owner in Miami -- ironic, as Florida is the state whose law once took over 100 words to precisely define "buttocks."

German language now officially includes "shitstorm"

"Shitstorm" has been inducted into Duden, the official German dictionary. It was a favorite among linguists, who admired its applicability to the plagiarism scandal that led to the resignation of Defence Minister Karl-Theodor zu Guttenberg. Strangely, an equivalent German word was not created by stringing together 75 other German words. (via The Mary Sue)

Website filler text based on the works of William Gibson

Need some filler text for a design? Screw Lorem Ipsum, go Lorem Gibson, "Website filler text based on the works of William Gibson." ::: smart- computer fluidity knife marketing modem apophenia faded corrupted marketing j-pop post- decay stimulate. woman fluidity j-pop Chiba Tokyo youtube corporation -space semiotics tower face forwards monofilament semiotics bomb. rebar nodal point free-market spook cardboard cartel garage futurity dead saturation point boy pen gang narrative ::: (via Bruce Sterling)

The BBC discovers the Texas Germans — and a dying dialect

My great-grandmother, Hedwig Nietzsche Koerth, never spoke English. My Grandpa Gustav didn't learn the language until he entered first grade. But, by the time I was in grade school — and was going through a brief fling of learning German — Grandpa no longer remembered much of what had once been his first language. Today, nobody in my immediate family speaks any German, much less the dying dialect of Texas German that my great-grandmother spoke. The BBC has an interesting story about the history and linguistics of Texas German, which will probably die out in the next couple generations — largely because the German Germans started a couple world wars in a row and changed the idea of what was and wasn't socially acceptable speech in America.

"Citation needed"'s Wikipedia entry

Regrettably, the Wikipedia entry for "Citation needed" ("a common editorial remark on Wikipedia, which has become used to refer to Wikipedia in wider popular culture") doesn't include any actual assertions tagged with [citation needed].

On July 4, 2007, the webcomic xkcd published a comic which depicted a protestor holding up a "citation needed" sign during a political speech.[7]

In late 2010, banners with the template appeared at the somewhat tongue-in-cheek Rally to Restore Sanity and/or Fear,[8] and in February 2011, at a more serious demonstration in Berlin against German defence minister Karl-Theodor zu Guttenberg, who had been embroiled in a scandal after it was discovered he had plagiarised portions of his doctoral thesis.[9]

The New York Times has commented on the propensity of some "stickler editors" for adding the template to unattributed facts,[10] and has used the phrase in an online headline.[11]

Citation needed (via JWZ)

Official list of English words misused in EU documents

A brief list of misused English terminology in EU publications [PDF] is a fascinating look at the dialect of English that is emerging out of the EU bureaucracy, in which odd bureaucratic language has to be translated from and to many languages. It's a good window into concepts that are common in one nation's bureaucratic tradition, but not others':

Dispose (of)
Explanation: the most common meaning of ‘dispose of’ is ‘to get rid of’ or ‘to throw away’; it never means ‘to have’, ‘to possess’ or ‘to have in one’s possession’. Thus, the sentence ‘The managing authority disposes of the data regarding participants.’ does not mean that it has them available; on the contrary, it means that it throws them away or deletes them. Similarly, the sentence below does not mean: ‘the Commission might not have independent sources of information’, it means that the Commission is not permitted to discard the sources that it has.

Example: ‘The Commission may not be able to assess the reliability of the data provided by Member States and may not dispose of independent information sources (see paragraph 39).’

As Bruce Sterling says, "I would not expect 'Brussels English' to get any closer to grammatically correct British English; on the contrary I would expect it in future to drift into areas of machine translation jargon, since that’s a lot cheaper than hiring human translators who are as skilled as the author of this document."

Web Semantics: Brussels English

Skeuomorphism, Apple, and Ricardo Montalbán's favorite station wagon

Over at Apple, Jony Ive is reportedly pulling back on the skeuomorphism for iOS 7. I'm glad. I don't care for skeuomorphism except in a very few instances, like the 1982 Chrysler Town & Country seen above with Ricardo Montalbán.

Slash: a new conjunction

"Slash" has emerged as a new conjunction, which is a rarity in slang. I love the fact that people spell out "slash" and then hyphenate it, and find it hard to believe that they're not doing this for the sheer delightful absurdity of it all:

...But for at least a good number of students, the conjunctive use of slash has extended to link a second related thought or clause to the first with a meaning that is often not quite “and” or “and/or” or “as well as.” It means something more like “following up.” Here are some real examples from students:

7. I really love that hot dog place on Liberty Street. Slash can we go there tomorrow?
8. Has anyone seen my moccasins anywhere? Slash were they given to someone to wear home ever?
9. I’ll let you know though. Slash I don’t know when I’m going to be home tonight
10. so what’ve you been up to? slash should we be skyping?
11. finishing them right now. slash if i don’t finish them now they’ll be done in first hour tomorrow

The student who searched her Facebook chat records found instances of this use of slash as far back as 2010. (When I shared a draft of this post with the students in the class to make sure I have my facts straight, several noted that in examples like (7) and (9), they would be more likely to use a comma in between the clauses and a lower-case “slash.”)

The innovative uses of slash don’t stop there either: some students are also using slash to introduce an afterthought that is also a topic shift, captured in this sample text from a student:

12. JUST SAW ALEX! Slash I just chubbed on oatmeal raisin cookies at north quad and i miss you

Slash: Not Just a Punctuation Mark Anymore [Anne Curzan/Chronicle of Higher Education]

(via Making Light)

When trademark becomes a tool for stealing our language

My latest Guardian column is "Trademarks: the good, the bad and the ugly," and it looks at why trademark, at its best, does something vital -- but how trademark can be abused to steal common words from our language and turn them into a twisted kind of pseudo-property.

Trademark lawyers have convinced their clients that they must pay to send a threatening notice to everyone who uses a trademark without permission, even where there is no chance of confusion. They send letters by the lorryload to journalists, website operators, signmakers, schools, dictionary publishers – anyone who might use their marks in a way that weakens the association in the public mind. But weakening an association is not illegal, despite the expansion of doctrines such as "dilution" and "naked licensing."

When called out on policing our language, trademark holders and their lawyers usually shrug their shoulders and say, "Nothing to do with us. The law requires us to threaten you, or we lose our association, and thus our mark." This is a very perverse way of understanding trademark.

The law is there to protect the public interest, and the public interest isn't undermined by the strength or weakness of an association with a specific word or mark with a specific company. The public interest extends to preventing fraud, and trademark uses the motivation of protecting profits to incentivise firms to uphold the public interest.

Trademarks: the good, the bad and the ugly

Automated constrained poetry, made from Markov Chains and Project Gutenberg

A "Snowball" is a poem "in which each line is a single word, and each successive word is one letter longer." Nossidge built an automated Snowball generator that uses Markov Chains, pulling text from Project Gutenberg. It's written in C++, with code on GitHub. The results are rather beautiful poems (these ones are "mostly Dickens"):

o
we
all
have
heard
people
believe
anything

i
am
the
dawn
light
before
anybody
expected
something
disorderly

i
am
the
very
great
change

Snowball (also called a Chaterism) (via Waxy)
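
If you want to play with the idea yourself, here is a minimal sketch of a Markov-chain Snowball generator. It is not Nossidge's code (that is C++): this is a Python illustration, the function names build_chain() and snowball() are hypothetical, and the one-line sample text stands in for a Project Gutenberg book.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that directly follow it in the text."""
    # Keep purely alphabetic tokens only, so words with attached punctuation are skipped.
    words = [w.lower() for w in text.split() if w.isalpha()]
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def snowball(chain, rng, max_lines=10):
    """Walk the chain, only stepping to a follower exactly one letter longer."""
    starts = sorted(w for w in chain if len(w) == 1)
    if not starts:
        return []
    poem = [rng.choice(starts)]
    while len(poem) < max_lines:
        last = poem[-1]
        options = [w for w in chain.get(last, []) if len(w) == len(last) + 1]
        if not options:
            break  # dead end: no follower of the right length
        poem.append(rng.choice(options))
    return poem


# Tiny stand-in corpus (assumption); a real run would load a whole book.
sample = (
    "i am the very great change and i am the dawn light "
    "before anybody expected something"
)
poem = snowball(build_chain(sample), random.Random(0))
print("\n".join(poem))
```

Because the walk only follows word pairs that really occur in the source text, the output keeps some of the author's phrasing, which is presumably why the results above sound "mostly Dickens."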

Canada Post claims exclusive use of the words "postal code"

Canada Post -- a failing, state-owned Crown Corporation -- not only claims a copyright on the database of postal codes (a collection of facts, and not the sort of thing that usually attracts copyright), but also claims a trademark on the words "postal code," and has sent legal threats to websites that use the words factually, to describe actual postal codes.

Canada Post disagrees. The crown corporation now argues that the very term “postal code” is subject to a trademark owned by Canada Post. Anyone using the term “postal code,” therefore, does so at their own risk.

“Canada Post has adopted and used Canadian Official Mark POSTAL CODE,” the statement of claim reads. “The Defendants have passed off their wares and services as and for those of Canada Post contrary to section 7(c) of the Trade-marks Act.”

What this means is Canada Post is changing direction in their lawsuit against Geolytica.

Geolytica has argued since the lawsuit began that they did not copy the Canada Post postal code database, but instead built their own based on the feedback of their own users. They crowd-sourced it. This makes Canada Post’s original copyright claim trickier, even if you set aside the facts vs. intellectual property argument.

Canada Post says they hold trademark on the words ‘postal code’

Early American tombstone euphemisms for death


In 2008, Caitlin GD Hopkins collected 101 euphemisms for "died" from early American epitaphs. The epitaphs came from tombstones pre-1825; to qualify, the euphemism had to appear in the main text of the tombstone ("Here lies Fred; born 1801, laid himself to rest 1824"), not in the verse below it ("He was a nice guy"). It's quite a list:

Part 1: Died
Part 2: Departed This Life
Part 3: Deceased
Part 4: Entred Apon an Eternal Sabbath of Rest
Part 5: Fell a Victim to an Untimely Disease
Part 6: Departed This Transitory Life
Part 7: Killed by the Fall of a Tree
Part 8: Left Us
Part 9: Obit
Part 10: Slain by the Enemy
Part 11: Departed This Stage of Existence
Part 12: Went Rejoycing Out of This World
Part 13: Submiting Her Self to ye Will of God
Part 14: Fell Asleep
Part 15: Changed a Fleeting World for an Immortal Rest
Part 16: Fell Asleep in the Cradle of Death
Part 17: Fell Aslep in Jesus
Part 18: Was Still Born
Part 19: Innocently Retired
Part 20: Expired
Part 21: Perished in a Storm
Part 22: Departed from This in Hope of a Better Life
Part 23: Summoned to Appear Before His Judge
Part 24: Liv'd About 2 Hours
Part 25: Rose Upon the Horizon of Perfect Endless Day

All 101 of them are linked to photos of the headstones in the actual post:

101 Ways to Say "Died" (via Making Light)

Yet another reason why jargon sucks

Yes, it's useful for communicating within your group, but as soon as you step outside that circle jargon becomes a problem. That's true even for scientists trying to communicate between disciplines and sub-disciplines of a field. At Ars Technica, John Timmer talks about jargon acronyms that look the same, but mean totally different things depending on what science you do. One of his examples: CTL. If you study flies, this can refer to a specific gene. For people who work with mice, it's a reference to curly tails. For immunologists, it's a type of white blood cell — cytotoxic T lymphocyte.