Robin Sloan is a programmer and novelist whose books like Sourdough and Mr Penumbra's 24-Hour Bookstore are rich and evocative blends of self-aware nerdy playfulness and magical speculation.
Anonymous stock-market manipulators behind $20B+ of "mispricing" can be tracked by their writing styles
In a new Columbia Law and Economics Working Paper, Columbia Law prof Joshua Mitts uses "stylometry" (previously) to track how market manipulators who publish false information about companies in order to profit from options are able to flush their old identities when they become notorious for misinformation and reboot them under new handles.
A presentation today at Defcon from Drexel computer science prof Rachel Greenstadt and GWU computer science prof Aylin Caliskan builds on the pair's earlier work in identifying the authors of software and shows that they can, with a high degree of accuracy, identify the anonymous author of software, whether in source-code or binary form.
In a newly revised paper in Computer Vision and Pattern Recognition, a group of French and Swiss computer science researchers show that there exists "a very small perturbation vector that causes natural images to be misclassified with high probability" -- that is, a minor image transformation can beat machine learning systems nearly every time.
In the immediate aftermath of the Trump administration's gag orders on government employees disclosing taxpayer-funded research results, a series of high-profile "rogue" government agency accounts popped up on Twitter, purporting to be managed by civil servants who are unwilling to abide by the gag order.
Michael from MuckRock writes, "When MuckRock stumbled on I Write Like - a service that lets you see which famous author a given piece of writing resembles - they immediately knew what it was destined for: Helping shed light on the literary influences of the mysterious FOIA offices they deal with on a daily basis. Fittingly, some offices echo HP Lovecraft's dark horror, while others are more Dan Brown. But you'll never guess which agency seems to take a cue from Cory Doctorow ..."
When Enron collapsed and got hit with a lawsuit requesting discovery on its internal email, its top bosses decided that they'd skip spending money on pricey lawyers to go through the archive and remove immaterial messages -- instead, they dumped the entire corpus of internal mail, including their employees' personal messages.
One of the most interesting technical presentations I attended in 2012 was the talk on "adversarial stylometry" given by a Drexel University research team at the 28C3 conference in Berlin. "Stylometry" is the practice of trying to ascribe authorship to an anonymous text by analyzing its writing style; "adversarial stylometry" is the practice of resisting stylometric de-anonymization by using software to remove distinctive characteristics and voice from a text.
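To make the definition of stylometry concrete, here is a minimal sketch of one common approach: compare how often each text uses a handful of function words, and attribute the anonymous text to the known author with the most similar profile. This is an illustration only, not the method of any of the research teams mentioned here. The word list and the names profile, cosine and attribute are assumptions made up for this example.

```python
import math
import re
from collections import Counter

# Hypothetical, very short feature set: common English function words.
# Real stylometric systems use far larger and more varied feature sets.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it",
                  "is", "i", "but", "not", "with", "for", "on"]


def profile(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = max(len(words), 1)  # avoid division by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def attribute(anonymous_text, known_samples):
    """Score each known author against the anonymous text; return (best author, all scores)."""
    target = profile(anonymous_text)
    scores = {author: cosine(target, profile(sample))
              for author, sample in known_samples.items()}
    return max(scores, key=scores.get), scores


samples = {
    "A": "the king of the hill and the lord of the manor",
    "B": "i think it is not bad but i do not know",
}
best, scores = attribute("the cat of the house of the", samples)
print(best, scores)
```

Adversarial stylometry, in these terms, means rewriting a text until its profile no longer sits closest to its real author's.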
Stanford's Arvind Narayanan describes a paper he co-authored on stylometry that has been accepted for the IEEE Symposium on Security and Privacy 2012. In On the Feasibility of Internet-Scale Author Identification (PDF) Narayanan and co-authors show that they can use stylometry to improve the reliability of de-anonymizing blog posts drawn from a large and diverse data-set, using a method that scales well. However, the experimental set was not "adversarial" -- that is, the authors took no countermeasures to disguise their authorship. It would be interesting to see how the approach described in the paper performs against texts that are deliberately anonymized, with and without computer assistance. The summary cites another paper by someone who found that even unaided efforts to disguise one's style make stylometric analysis much less effective.
We made several innovations that allowed us to achieve the accuracy levels that we did. First, contrary to some previous authors who hypothesized that only relatively straightforward “lazy” classifiers work for this type of problem, we were able to avoid various pitfalls and use more high-powered machinery. Second, we developed new techniques for confidence estimation, including a measure very similar to “eccentricity” used in the Netflix paper.
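The confidence-estimation idea in that quote can be sketched roughly as follows. In the Netflix de-anonymization paper, "eccentricity" measures how far the best candidate's score stands out from the runner-up, in units of the standard deviation of all candidate scores. The code below is an assumption-laden illustration, not the paper's actual measure (which is described only as "very similar"). The function names and the 1.5 threshold are hypothetical.

```python
import statistics


def eccentricity(scores):
    """Gap between the best and second-best score, divided by the
    population standard deviation of all scores."""
    if len(scores) < 2:
        raise ValueError("need at least two candidate scores")
    ordered = sorted(scores, reverse=True)
    sigma = statistics.pstdev(scores)
    if sigma == 0:
        return 0.0  # all candidates tie, so no match stands out
    return (ordered[0] - ordered[1]) / sigma


def confident_match(scores_by_author, threshold=1.5):
    """Return (author, eccentricity); author is None when the best match
    does not stand out enough. The threshold is an arbitrary example value."""
    best = max(scores_by_author, key=scores_by_author.get)
    ecc = eccentricity(list(scores_by_author.values()))
    return (best if ecc >= threshold else None), ecc


clear = {"A": 0.9, "B": 0.2, "C": 0.1, "D": 0.15}
murky = {"A": 0.5, "B": 0.49, "C": 0.1, "D": 0.1}
print(confident_match(clear))
print(confident_match(murky))
```

The point of such a measure is that a classifier can decline to name an author when two candidates score nearly the same, trading coverage for fewer false identifications.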
Today at the Chaos Communication Congress in Berlin (28C3), Sadia Afroz and Michael Brennan presented a talk called "Deceiving Authorship Detection," about research from Drexel University on "Adversarial Stylometry," the practice of identifying the authors of texts who don't want to be identified, and the process of evading detection. Stylometry has made great and well-publicized advances in recent years (and it made the news with scandals like "Gay Girl in Damascus"), but typically this has been against authors who have not taken active, computer-assisted countermeasures to disguise their distinctive "voice" in prose.
As part of the presentation, the Drexel team released Anonymouth, a free/open tool that partially automates the process of evading authorship detection. The tool is still a rough alpha, and it requires human intervention to oversee the texts it produces, but it is still an exciting move in adversarial stylometry tools. Accompanying the release are large corpora of test data of deceptive and non-deceptive texts.
Stylometry has been cited by knowledgeable critics as proof of the pointlessness of the Nym Wars: why argue for the right to be anonymous or pseudonymous on Google Plus or Facebook when stylometry will de-anonymize you anyway? I've been suspicious of these critiques because they assume that only de-anonymizers will have access to computer-assisted tools, but as Anonymouth shows, there are many opportunities to use automation tools to improve anonymity.
Stylometry matters in many ways: its state of the art changes the balance of power between trolls and moderators, between dissidents and dictators, between employers and whistleblowers, between astroturfers and commenters, and between spammers and filters.