In a new Columbia Law and Economics Working Paper, Columbia Law prof Joshua Mitts uses "stylometry" (previously) to track how market manipulators who publish false information about companies in order to profit from options are able to flush their old identities when they become notorious for misinformation and reboot them under new handles.
A presentation today at Defcon from Drexel computer science prof Rachel Greenstadt and GWU computer science prof Aylin Caliskan builds on the pair's earlier work in identifying the authors of software and shows that they can, with a high degree of accuracy, identify the anonymous author of software, whether in source-code or binary form.
Michael from Muckrock writes, "When MuckRock stumbled on I Write Like - a service that lets you see which famous author a given piece of writing resembles - they immediately knew what it was destined for: Helping shed light on the literary influences of the mysterious FOIA offices they deal with on a daily basis. Fittingly, some offices echo HP Lovecraft's dark horror, while others are more Dan Brown. But you'll never guess which agency seems to take a cue from Cory Doctorow ..."
When Enron collapsed and got hit with a lawsuit requesting discovery on its internal email, its top bosses decided that they'd skip spending money on pricey lawyers to go through the archive and remove immaterial messages -- instead, they dumped the entire corpus of internal mail, including their employees' personal messages.
Something Awful celebrates the deathless prose of Thomas Friedman and the mountains of empty calories on offer at the International House of Pancakes -- Friedman's culinary equivalent -- by giving us notional menu copy as written by the Great Flat One.
One of the most interesting technical presentations I attended in 2012 was the talk on "adversarial stylometry" given by a Drexel University research team at the 28C3 conference in Berlin. "Stylometry" is the practice of trying to ascribe authorship to an anonymous text by analyzing its writing style; "adversarial stylometry" is the practice of resisting stylometric de-anonymization by using software to remove distinctive characteristics and voice from a text.
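To make the idea of stylometric attribution concrete, here is a minimal sketch, not the Drexel team's method. It assumes that style can be crudely summarized by how often an author uses a handful of common function words, and it compares texts by cosine similarity. The word list, the function names (`style_vector`, `cosine`, `attribute`) and the sample texts are all hypothetical; real systems use far richer feature sets and proper classifiers.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny feature set: a few English function words.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in",
                  "that", "is", "it", "but", "i", "was"]


def style_vector(text):
    """Return the relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def attribute(anonymous_text, known_samples):
    """Rank candidate authors by stylistic similarity to the anonymous text.

    known_samples maps an author name to a sample of that author's writing.
    Returns (author, score) pairs, most similar first.
    """
    target = style_vector(anonymous_text)
    scores = {author: cosine(target, style_vector(sample))
              for author, sample in known_samples.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    samples = {
        "alice": "the the the of of and",
        "bob": "i was i was it is but",
    }
    for author, score in attribute("the of the and the", samples):
        print(author, round(score, 3))
```

Adversarial stylometry, in these terms, means rewriting a text so that its feature vector no longer sits close to the author's known samples.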
Stanford's Arvind Narayanan describes a paper he co-authored on stylometry that has been accepted for the IEEE Symposium on Security and Privacy 2012. In On the Feasibility of Internet-Scale Author Identification (PDF) Narayanan and co-authors show that they can use stylometry to improve the reliability of de-anonymizing blog posts drawn from a large and diverse data-set, using a method that scales well. However, the experimental set was not "adversarial" -- that is, the authors took no countermeasures to disguise their authorship. It would be interesting to see how the approach described in the paper performs against texts that are deliberately anonymized, with and without computer assistance. The summary cites another paper by someone who found that even unaided efforts to disguise one's style make stylometric analysis much less effective.
We made several innovations that allowed us to achieve the accuracy levels that we did. First, contrary to some previous authors who hypothesized that only relatively straightforward “lazy” classifiers work for this type of problem, we were able to avoid various pitfalls and use more high-powered machinery. Second, we developed new techniques for confidence estimation, including a measure very similar to “eccentricity” used in the Netflix paper.
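As a rough illustration of the confidence-estimation idea mentioned above: in the Netflix de-anonymization paper, "eccentricity" measures how far the best match stands out from the runner-up, in units of the standard deviation of all candidate scores. The quote says only that the authors' measure is "very similar", so the sketch below is an assumption about the general shape, not their implementation. The function names, the choice of population standard deviation and the threshold of 1.5 are hypothetical.

```python
import statistics


def eccentricity(scores):
    """Gap between the best and second-best score, in standard deviations.

    Assumes higher scores mean better matches. Uses the population
    standard deviation; that choice is an assumption of this sketch.
    """
    if len(scores) < 2:
        raise ValueError("need at least two candidate scores")
    ordered = sorted(scores, reverse=True)
    sigma = statistics.pstdev(scores)
    if sigma == 0:
        return 0.0  # every candidate scored the same: nothing stands out
    return (ordered[0] - ordered[1]) / sigma


def confident_match(scores_by_author, threshold=1.5):
    """Return the top-scoring author only if the match is eccentric enough.

    The threshold is a hypothetical tuning parameter; returns None when
    the best candidate does not stand out from the rest.
    """
    if eccentricity(list(scores_by_author.values())) < threshold:
        return None
    return max(scores_by_author, key=scores_by_author.get)
```

Declining to answer when the eccentricity is low trades coverage for precision: the classifier names an author only when one candidate clearly outscores the others.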