There's something eerie about bots that teach themselves to cheat

One of the holy grails of computer science is machine learning that needs no human supervision (think reinforcement learning and evolutionary algorithms): you tell an algorithm what goal you want it to attain, give it some data or a simulated environment to practice in, and the algorithm uses statistics to invent surprising ways of solving your problem.

Oftentimes, that means cheating. Because machine learning systems systematically probe their training environments for possible paths to victory, they often discover bugs, glitches and holes in the rules that they can exploit to attain their goals — and without human supervision, they have no way to know that their awesome exploit is actually against the rules and shouldn't be employed.

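To see how little it takes, here's a minimal sketch of the failure mode (the toy gridworld, the reward function, and all names are my own invention, not taken from any of the studies Wired covers): a random-search "learner" maximizes a reward whose author forgot to check the y coordinate, and it reliably finds the hole instead of the goal.

```python
import random

GOAL = (3, 3)  # the cell the designer *intended* the agent to reach

def buggy_reward(path):
    """Score a path of moves. The rule means 'reward reaching the goal,'
    but the check only tests x -- the y comparison was forgotten."""
    x, y = 0, 0
    for dx, dy in path:
        x, y = x + dx, y + dy
    return 1.0 if x == GOAL[0] else 0.0  # the hole in the rules

def random_policy_search(trials=2000, path_len=5):
    """Unsupervised trial and error: keep whatever path scores best."""
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    best_path, best_score = [], -1.0
    for _ in range(trials):
        path = [random.choice(moves) for _ in range(path_len)]
        score = buggy_reward(path)
        if score > best_score:
            best_path, best_score = path, score
    return best_path, best_score

path, score = random_policy_search()
x, y = sum(dx for dx, _ in path), sum(dy for _, dy in path)
# Typically prints reward=1.0 with a final y nowhere near 3:
# full marks for a "win" that never touches the goal.
print(f"reward={score}, final cell=({x}, {y}), goal={GOAL}")
```
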
Wired's Tom Simonite rounds up a tidy little list of cheating algorithms, like the one that scored high in a survival simulation by evolving a species that ate nothing but its own children. I love the list, but wish it came with links to the studies!

It's a really good reminder of why black-box algorithmic decision-making tools need to be auditable: if the algorithm says you're not a good risk for parole or a loan, you need to be able to tell whether it's cheating (say, by disqualifying all poor people) or fairly evaluating you.

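One simple shape such an audit can take is counterfactual probing: score the same applicant twice, changing a single attribute, and see whether the decision flips. Here's a hypothetical sketch (the model and every field name are invented; a real audit would query the deployed system):

```python
def black_box_score(applicant: dict) -> bool:
    """Stand-in for an opaque model we can query but not inspect.
    Its hidden rule is exactly the kind of cheat an audit should catch:
    it quietly turns down anyone below an income cutoff."""
    return applicant["income"] > 40_000

def decision_flips(model, applicant: dict, feature: str, alt_value) -> bool:
    """Counterfactual probe: does changing only `feature` flip the outcome?"""
    counterfactual = {**applicant, feature: alt_value}
    return model(applicant) != model(counterfactual)

applicant = {"income": 25_000, "years_of_credit": 12, "defaults": 0}
if decision_flips(black_box_score, applicant, "income", 80_000):
    print("approval hinges on income alone: flag for human review")
```
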

Space War: Algorithms exploited flaws in the rules of the galactic videogame Elite Dangerous to invent powerful new weapons.

Body Hacking: A four-legged virtual robot was challenged to walk smoothly by balancing a ball on its back. Instead, it trapped the ball in a leg joint, then lurched along as before.

Goldilocks Electronics: Software evolved circuits to interpret electrical signals, but the design only worked at the temperature of the lab where the study took place.

Optical Illusion: Humans teaching a gripper to grasp a ball accidentally trained it to exploit the camera angle so that it appeared successful—even when not touching the ball.

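That last one is a textbook case of optimizing a proxy measurement instead of the real objective: the reward came from what the camera could see, not from actual contact. A hypothetical sketch of the bug's shape (the coordinates, thresholds, and flat random search are all invented): judge the grasp from a 2D projection and the optimizer happily ignores depth.

```python
import random

def looks_grasped(gripper, ball):
    """Proxy reward, as a fixed camera might score it: compare only the
    2D projection and throw away depth. This is the exploitable part."""
    (gx, gy, _), (bx, by, _) = gripper, ball
    return ((gx - bx) ** 2 + (gy - by) ** 2) ** 0.5 < 0.05

def actually_grasped(gripper, ball):
    """Ground truth: real 3D distance between gripper and ball."""
    return sum((g - b) ** 2 for g, b in zip(gripper, ball)) ** 0.5 < 0.05

ball = (0.5, 0.5, 0.5)

# Trial and error against the proxy only: accept the first pose the
# "camera" calls a success.
pose = (random.random(), random.random(), random.random())
while not looks_grasped(pose, ball):
    pose = (random.random(), random.random(), random.random())

print("looks grasped:  ", looks_grasped(pose, ball))      # always True
print("actually grasped:", actually_grasped(pose, ball))  # usually False
```
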
When Bots Teach Themselves to Cheat [Tom Simonite/Wired]

(Image: Dennis Hill, Cryteria, CC-BY)