Machine learning systems are pretty good at finding hidden correlations in data and using them to infer potentially compromising information about the people who generate that data: for example, researchers fed an ML system a bunch of Google Play reviews by reviewers whose locations were explicitly given in their Google Plus reviews; based on this, the model was able to predict the locations of other Google Play reviewers with about 44% accuracy.
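As a loose illustration of the kind of inference at work (not the researchers' actual data or pipeline), a hypothetical location-guessing model could be as simple as a text classifier trained on reviews from users who disclosed where they live:

```python
# Hypothetical sketch: guess a reviewer's city from review text alone.
# The data, features, and model here are illustrative stand-ins, not the
# researchers' actual pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Training data: reviews by users whose location is known from elsewhere.
labeled_reviews = [
    ("Great for finding poutine spots downtown", "Toronto"),
    ("Kept crashing on the Tube during my commute", "London"),
    ("Perfect for taco trucks near the Mission", "San Francisco"),
]
texts, cities = zip(*labeled_reviews)

# Bag-of-words features plus a linear classifier: the correlations between
# word choice and location are exactly the "hidden" signal being mined.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(texts, cities)

# Inference on a reviewer who never stated a location.
print(model.predict(["Best app for dodging streetcar delays downtown"]))
```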
For several years, I've been covering the bizarre phenomenon of "adversarial examples" (AKA "adversarial perturbations"), often tiny changes to data that can cause machine-learning classifiers to totally misfire: imperceptible squeaks that make speech-to-text systems hallucinate phantom voices; or tiny shifts to a 3D image of a helicopter that make image-classifiers hallucinate a rifle.
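To make the mechanics concrete, here's a minimal sketch of one standard way to craft such a perturbation (the "fast gradient sign" trick) in PyTorch. The model, image, and step size are stand-ins, and this is not the specific attack behind the squeaks or the helicopter demo:

```python
# Minimal FGSM-style sketch of an adversarial perturbation.
# The model weights, the input image, and epsilon are placeholders;
# a real attack would use a trained model and a real photo.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None).eval()  # use pretrained weights in practice

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # stand-in photo
true_label = torch.tensor([207])                         # stand-in class id

# Gradient of the loss with respect to the *pixels*, not the model weights.
loss = F.cross_entropy(model(image), true_label)
loss.backward()

# Step each pixel a tiny amount in the direction that most increases the
# loss: visually near-identical, but often enough to flip the prediction.
epsilon = 0.01
adversarial = (image + epsilon * image.grad.sign()).clamp(0, 1)

with torch.no_grad():
    print(model(adversarial).argmax(dim=1))  # frequently no longer 207
```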
Machine learning systems trained for object recognition rely on a bunch of evolved shortcuts to decide which parts of an image matter to their classifications and which ones can be safely ignored.
In Partial Information Attacks on Real-world AI, a group of MIT computer science researchers report on their continuing work fooling Google's image-classifier, this time without any knowledge of how the classifier works, relying only on the labels and scores it returns for their queries.
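That query-only setting is much harder than attacking a model whose internals you can inspect. Below is a rough, hypothetical sketch of how gradient estimation from queries alone can work; `query_score` is a toy stand-in for a remote API, not Google's classifier, and the researchers' actual method adds many refinements:

```python
# Sketch of query-based gradient estimation against a black box,
# loosely in the spirit of NES-style attacks. `query_score` is a
# hypothetical stand-in for a remote classifier API.
import numpy as np

rng = np.random.default_rng(0)

def query_score(image: np.ndarray) -> float:
    """Stand-in black box: returns a confidence for the target class."""
    target = np.full_like(image, 0.5)
    return float(-np.mean((image - target) ** 2))  # toy objective

def estimate_gradient(image, sigma=0.001, samples=50):
    """Estimate d(score)/d(pixels) purely from paired queries."""
    grad = np.zeros_like(image)
    for _ in range(samples):
        noise = rng.standard_normal(image.shape)
        grad += noise * (query_score(image + sigma * noise)
                         - query_score(image - sigma * noise))
    return grad / (2 * sigma * samples)

image = rng.random((8, 8))  # toy "image"
for step in range(100):
    # Ascend the estimated gradient; a real attack would also keep every
    # pixel within a small budget of the original image.
    image = np.clip(image + 0.01 * np.sign(estimate_gradient(image)), 0, 1)

print(query_score(image))   # confidence climbs without ever seeing a gradient
```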
Machine-learning-based image classifiers are vulnerable to "adversarial perturbations": small, seemingly innocuous modifications to images that can totally confound them.