
The "universal adversarial preturbation" undetectably alters images so AI can't recognize them


In a newly revised paper in Computer Vision and Pattern Recognition, a group of French and Swiss computer science researchers demonstrate "a very small perturbation vector that causes natural images to be misclassified with high probability" — that is, a single, minor image transformation that beats machine-learning classifiers nearly every time.


What's more, the researchers present evidence that similar tiny distortions exist for other kinds of data and confound machine-learning systems generally. The "universal adversarial perturbation" changes images in ways that are imperceptible to the human eye, but devastating to machine vision.

The research is typical of the early phase of computer-science breakthroughs: incredible results from classification systems that no one is yet trying to game. Think of when Google figured out that the links between webpages — only ever made as an expression of interest — could be used to rank the importance of those pages, only to kick off an arms race as attackers figured out they could get high rankings with easy-to-maintain linkfarms. The same thing happened when stylometry begat adversarial stylometry.


All the success in using deep-learning classification to identify people or catch cheaters has assumed that the other side never deploys countermeasures. This paper suggests that such countermeasures are trivial to generate and devastating in practice.
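To make that concrete, here is a minimal sketch (not the authors' code) of what deploying such a countermeasure looks like: one precomputed perturbation is added to every image before it reaches the classifier. The model choice, the file "universal_perturbation.pt", and the input image are placeholders for illustration.

```python
# Sketch: apply one precomputed universal perturbation to any input image.
# Assumes the perturbation tensor was computed offline and saved to disk;
# "universal_perturbation.pt" is a hypothetical filename.
import torch
from torchvision import models, transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),  # ImageNet mean/std normalization omitted for brevity
])

model = models.resnet18(pretrained=True).eval()

v = torch.load("universal_perturbation.pt")  # shape (1, 3, 224, 224), tiny L-inf norm

x = preprocess(Image.open("cat.jpg")).unsqueeze(0)  # clean image
x_adv = torch.clamp(x + v, 0.0, 1.0)                # visually indistinguishable from x

with torch.no_grad():
    print("clean label:    ", model(x).argmax(dim=1).item())
    print("perturbed label:", model(x_adv).argmax(dim=1).item())  # often different
```

The attacker's cost is one tensor addition per image; the perturbation itself only has to be computed once. The paper's own summary of its contributions: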

• We show the existence of universal image-agnostic perturbations for state-of-the-art deep neural networks.

• We propose an algorithm for finding such perturbations. The algorithm seeks a universal perturbation for a set of training points, and proceeds by aggregating atomic perturbation vectors that send successive datapoints to the decision boundary of the classifier.

• We show that universal perturbations have a remarkable generalization property, as perturbations computed for a rather small set of training points fool new images with high probability.

• We show that such perturbations are not only universal across images, but also generalize well across deep neural networks. Such perturbations are therefore doubly universal, both with respect to the data and the network architectures.

• We explain and analyze the high vulnerability of deep neural networks to universal perturbations by examining the geometric correlation between different parts of the decision boundary.

Universal adversarial perturbations

[Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi and Pascal Frossard/ Computer Vision and Pattern Recognition]
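The second bullet above describes how the perturbation is built up. Below is a rough sketch of that iterative scheme, assuming PyTorch; it is not the authors' implementation. The paper uses DeepFool to find each per-image minimal perturbation, while the minimal_perturbation helper here substitutes a simple gradient-sign search, and the hyperparameters (xi, step size, fooling-rate target) are illustrative.

```python
import torch
import torch.nn.functional as F

def project_linf(v, xi):
    # Keep the universal perturbation inside an L-infinity ball of radius xi
    # so it stays imperceptible.
    return torch.clamp(v, -xi, xi)

def minimal_perturbation(model, x, step=0.005, max_iter=50):
    # Hypothetical stand-in for DeepFool: nudge x with gradient-sign steps
    # until the predicted label changes, then return the accumulated nudge.
    orig = model(x).argmax(dim=1)
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(max_iter):
        logits = model(x + delta)
        if logits.argmax(dim=1).item() != orig.item():
            break
        loss = F.cross_entropy(logits, orig)
        model.zero_grad()
        loss.backward()
        with torch.no_grad():
            delta += step * delta.grad.sign()
        delta.grad.zero_()
    return delta.detach()

def universal_perturbation(model, images, xi=0.04, target_rate=0.8, epochs=5):
    # Aggregate per-image perturbations into one vector v that changes the
    # classifier's answer on most images, re-projecting v after every update.
    v = torch.zeros_like(images[:1])
    for _ in range(epochs):
        for x in images.split(1):
            still_same = (
                model(x + v).argmax(dim=1).item() == model(x).argmax(dim=1).item()
            )
            if still_same:
                dv = minimal_perturbation(model, x + v)
                v = project_linf(v + dv, xi)
        fooled = sum(
            model(x + v).argmax(dim=1).item() != model(x).argmax(dim=1).item()
            for x in images.split(1)
        )
        if fooled / len(images) >= target_rate:
            break
    return v
```

The idea the bullets describe survives the simplification: each image the current perturbation fails to fool contributes a small extra push toward the decision boundary, and the result is repeatedly projected back onto a tiny ball so it stays invisible. The returned v is exactly the kind of tensor the earlier sketch loads from disk and adds to new images at inference time.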


(via JWZ)
