The mystical Kabbalah roots of natural language processing

With Siri and Alexa, the computer science of natural language processing (NLP) is finally ready for prime time. In IEEE Spectrum, Oscar Schwartz wrote a fascinating essay linking NLP, "linguistic interactions between humans and machines," with 13th-century Jewish mysticism. I've always enjoyed smart writing that pulls threads between technology and occult practices, and Schwartz's short piece is a fine example of that. From IEEE Spectrum:

In the late 1200s, a Jewish mystic by the name of Abraham Abulafia sat down at a table in his small house in Barcelona, picked up a quill, dipped it in ink, and began combining the letters of the Hebrew alphabet in strange and seemingly random ways. Aleph with Bet, Bet with Gimmel, Gimmel with Aleph and Bet, and so on.

Abulafia called this practice “the science of the combination of letters.” He wasn’t actually combining letters at random; instead he was carefully following a secret set of rules that he had devised while studying an ancient Kabbalistic text called the Sefer Yetsirah. This book describes how God created “all that is formed and all that is spoken” by combining Hebrew letters according to sacred formulas. In one section, God exhausts all possible two-letter combinations of the 22 Hebrew letters.

By studying the Sefer Yetsirah, Abulafia gained the insight that linguistic symbols can be manipulated with formal rules in order to create new, interesting, insightful sentences. To this end, he spent months generating thousands of combinations of the 22 letters of the Hebrew alphabet and eventually emerged with a series of books that he claimed were endowed with prophetic wisdom.
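To make the arithmetic in that passage concrete, here is a minimal Python sketch that exhausts the two-letter combinations of the 22 letters. The transliterated letter names and the `letter_pairs` helper are illustrative assumptions, not anything taken from Schwartz's essay or from Abulafia's actual rules, and the excerpt does not say whether order matters, so both counts are shown.

```python
from itertools import combinations, permutations

# Transliterated names of the 22 Hebrew letters (spellings are an assumption;
# the excerpt itself only names Aleph, Bet, and Gimmel).
LETTERS = [
    "Aleph", "Bet", "Gimmel", "Dalet", "He", "Vav", "Zayin", "Chet",
    "Tet", "Yod", "Kaf", "Lamed", "Mem", "Nun", "Samekh", "Ayin",
    "Pe", "Tsadi", "Qof", "Resh", "Shin", "Tav",
]


def letter_pairs(letters, ordered=False):
    """Return every pairing of two distinct letters.

    ordered=False treats Aleph-Bet and Bet-Aleph as the same pair;
    ordered=True counts them separately.
    """
    pairing = permutations if ordered else combinations
    return list(pairing(letters, 2))


unordered = letter_pairs(LETTERS)
ordered = letter_pairs(LETTERS, ordered=True)

print(len(LETTERS), len(unordered), len(ordered))  # prints: 22 231 462
```

Either way the space is small enough to write out by hand: 231 pairs if order is ignored, 462 if it is not.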

Train your AI with the world's largest data-set of sarcasm, courtesy of redditors' self-tagging

Redditors' convention of tagging their sarcastic remarks is a dream come true for machine learning researchers hoping to teach computers to recognize and/or generate sarcasm.

Analyzing all known Metal lyrics with natural language processing

Iain ("an ex-physicist currently working as a data scientist") scraped Dark Lyrics and built a dataset of lyrics to 222,623 songs by 7,364 metal bands, then used traditional natural language processing techniques to analyze them.

GCHQ's dirty-tricking psyops groups: infiltrating, disrupting and discrediting political and protest groups

In a piece on the new Omidyar-funded news-site "The Intercept," Glenn Greenwald pulls together the recent Snowden leaks about the NSA's psyops programs, through which they sought to attack, undermine, and dirty-trick participants in Anonymous and Occupy. The new leaks describe GCHQ's use of "false flag" operations (undertaking malicious actions and making it look like the work of a group they wish to discredit), the application of "social science" to disrupting and steering online activist discussions, luring targets into compromising sexual situations, deploying malicious software, and posting lies about targets in order to discredit them.

As Greenwald points out, the unit that conducted these actions, "Jtrig" (Joint Threat Research Intelligence Group), does not limit itself to attacking terrorists -- it explicitly targets protest groups, and political groups that have no connection with national security, including garden-variety criminals who are properly the purview of law enforcement agencies, not intelligence agencies.

The UK spy agency GCHQ operates a programme, called the "Human Science Operations Cell," whose remit is "strategic influence and disruption."

Some of the slides suggest pretty dubious "social science" (see below) -- they read like a mix of NLP hucksters and desperate Pick Up Artist losers.