In a newly revised paper in Computer Vision and Pattern Recognition, a group of French and Swiss computer science researchers show that there exists "a very small perturbation vector that causes natural images to be misclassified with high probability" — that is, a single minor image transformation can beat machine-learning systems nearly every time.
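To make the claim concrete, here's a minimal sketch of how such a "universal" perturbation is evaluated; `classify()` and `images` are hypothetical stand-ins for the model and a set of natural images, and the paper's contribution is finding one fixed vector for which this fooling rate is high.

```python
import numpy as np

# Hypothetical stand-ins: classify(img) returns a label for one image,
# and `images` is an array of natural images scaled to [0, 1].
def fooling_rate(classify, images, perturbation):
    """Fraction of images whose label changes when the SAME small
    perturbation vector is added to every one of them."""
    fooled = 0
    for img in images:
        clean_label = classify(img)
        adv_label = classify(np.clip(img + perturbation, 0.0, 1.0))
        if adv_label != clean_label:
            fooled += 1
    return fooled / len(images)

# A "universal" perturbation is a single fixed vector with a tiny norm
# for which fooling_rate(...) stays high across most natural images.
```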
A team of researchers from Microsoft and Harvard's Berkman Center has published a taxonomy of "Failure Modes in Machine Learning," broken down into "Intentionally-Motivated Failures" and "Unintended Failures."
Machine learning systems are pretty good at finding hidden correlations in data and using them to infer potentially compromising information about the people who generate that data: for example, researchers fed an ML system a bunch of Google Play reviews by reviewers whose locations were explicitly given in their Google Plus reviews; based on this, the model was able to predict the locations of other Google Play reviewers with about 44% accuracy.
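As a rough illustration of the inference technique (not the researchers' actual pipeline), here's a sketch using scikit-learn: reviews from people who disclosed a location become labeled training data, and the model then guesses locations for everyone else. The review texts and city labels below are made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Made-up data: review texts plus the city each labeled reviewer disclosed.
labeled_reviews = ["great app, crashes on my bus commute", "useless in the metro"]
labeled_cities = ["Seattle", "Paris"]
unlabeled_reviews = ["keeps freezing on the ferry"]

# Turn review text into features and fit a simple classifier.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(labeled_reviews)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, labeled_cities)

# Predict a location for reviewers who never said where they live.
print(model.predict(vectorizer.transform(unlabeled_reviews)))
```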
An adversarial perturbation is a small, human-imperceptible change to a piece of data that flummoxes an otherwise well-behaved machine learning classifier: for example, there's a really accurate ML model that guesses which full-sized image corresponds to a small thumbnail, but if you change just one pixel in the thumbnail, the classifier stops working almost entirely.
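The attack surface really is that small. Here's a toy sketch of the check, assuming a hypothetical `match_thumbnail()` function standing in for the thumbnail-matching model:

```python
import numpy as np

# Hypothetical matcher: match_thumbnail(thumb) returns which full-sized
# image the model thinks a given thumbnail came from.
def one_pixel_flip(match_thumbnail, thumb, x, y, new_value):
    """Change a single pixel and see whether the model's answer changes."""
    before = match_thumbnail(thumb)
    perturbed = thumb.copy()
    perturbed[y, x] = new_value          # the entire "attack" is one pixel
    after = match_thumbnail(perturbed)
    return before, after, before != after
```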
For several years, I've been covering the bizarre phenomenon of "adversarial examples" (AKA "adversarial perturbations"), these being often-tiny changes to data that can cause machine-learning classifiers to totally misfire: imperceptible squeaks that make speech-to-text systems hallucinate phantom voices, or tiny shifts to a 3D image of a helicopter that make image-classifiers hallucinate a rifle.
Adversarial examples have torn into the robustness of machine-vision systems: it turns out that changing even a single well-placed pixel can confound otherwise reliable classifiers, and with the right tricks classifiers can be made to reliably misclassify one thing as another or fail to notice an object altogether.
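Targeted misclassification of that kind is usually found with an iterative, gradient-guided search. The sketch below follows the general projected-gradient recipe rather than any one paper's method, and assumes hypothetical `classify()` and `loss_grad()` oracles for the model:

```python
import numpy as np

# Hypothetical oracles: loss_grad(img, target) returns the gradient of the
# model's loss toward `target`; classify(img) returns the current label.
def targeted_attack(classify, loss_grad, img, target,
                    eps=8/255, step=1/255, iters=40):
    """Nudge the image toward a chosen wrong label while keeping the total
    change inside an imperceptible budget `eps` (a PGD-style sketch)."""
    adv = img.copy()
    for _ in range(iters):
        adv = adv - step * np.sign(loss_grad(adv, target))  # descend target loss
        adv = np.clip(adv, img - eps, img + eps)            # stay near the original
        adv = np.clip(adv, 0.0, 1.0)
        if classify(adv) == target:
            break
    return adv
```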
Machine learning systems trained for object recognition deploy a bunch of shortcuts, evolved during training, to choose which parts of an image are important to their classifications and which ones can be safely ignored.
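One common way of peeking at those shortcuts is a gradient-based saliency map: pixels whose small changes would move the class score the most are the ones the model is effectively looking at. A minimal sketch, assuming a hypothetical `score_grad()` oracle for the model's gradients:

```python
import numpy as np

# Hypothetical oracle: score_grad(img, label) returns d(score for label)/d(pixels).
def saliency_map(score_grad, img, label):
    """Rough per-pixel importance: how strongly each pixel influences the score."""
    grad = score_grad(img, label)
    return np.max(np.abs(grad), axis=-1)   # collapse color channels to one heat value

# Regions with near-zero saliency are the shortcuts in action: the model has
# learned it can safely ignore them when making its call.
```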
In Partial Information Attacks on Real-world AI, a group of MIT computer science researchers report on their continuing work fooling Google's image-classifier, this time without any knowledge of how the classifier works.
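The striking part is that the attack needs nothing but the classifier's answers. Here's a sketch of the query-only gradient-estimation idea used in this line of black-box work; `target_prob()` is a hypothetical stand-in for the API call that returns the classifier's confidence in the label the attacker wants to force.

```python
import numpy as np

# Hypothetical query: target_prob(img) returns the classifier's reported
# probability for the attacker's chosen label (the only feedback available).
def estimate_gradient(target_prob, img, sigma=0.001, samples=50):
    """Estimate the gradient of the target probability from queries alone
    (a finite-difference sketch; no access to model internals)."""
    grad = np.zeros_like(img)
    for _ in range(samples):
        noise = np.random.randn(*img.shape)
        plus = target_prob(np.clip(img + sigma * noise, 0.0, 1.0))
        minus = target_prob(np.clip(img - sigma * noise, 0.0, 1.0))
        grad += (plus - minus) * noise
    return grad / (2 * sigma * samples)

# Climbing this estimated gradient step by step pushes the image toward the
# attacker's chosen label using nothing but the classifier's answers.
```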
Machine-learning-based image classifiers are vulnerable to "adversarial perturbations," where small, seemingly innocuous modifications to images (including very trivial ones) can totally confound them.
Three researchers from Kyushu University have published a paper describing a means of reliably fooling AI-based image classifiers with a single well-placed pixel.
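The paper's search is done with differential evolution: the attacker optimizes over one pixel's position and color, using only the classifier's confidence as feedback. A minimal sketch of that setup with scipy, assuming images scaled to [0, 1] and a hypothetical `true_prob()` query that returns the model's confidence in the correct label:

```python
import numpy as np
from scipy.optimize import differential_evolution

# Hypothetical query: true_prob(img) returns the classifier's confidence in
# the image's correct label; the attack simply tries to drive it down.
def one_pixel_attack(true_prob, img):
    h, w, _ = img.shape

    def apply(candidate, image):
        x, y, r, g, b = candidate
        out = image.copy()
        out[int(y), int(x)] = [r, g, b]      # the whole perturbation: one pixel
        return out

    # Search over pixel position and color; lower confidence = better attack.
    bounds = [(0, w - 1), (0, h - 1), (0, 1), (0, 1), (0, 1)]
    result = differential_evolution(
        lambda c: true_prob(apply(c, img)),
        bounds, maxiter=30, popsize=20, seed=0)
    return apply(result.x, img)
```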