Research into the shittiness of voice assistants zeroed in on a problem that many people were all too aware of: the inability of these devices to recognize "accented" speech ("accented" in scare quotes because there is no single formally correct English, and some of the most widely spoken English variants, such as Indian English, fall into this "accented" category).
Speech recognition is one of the many categories of machine learning where racial and other forms of bias in training data produce bad models that produce bad outcomes for already marginalized people (see also: facial recognition systems that work well on white people but can't tell Black people apart).
It's tempting to think of machine learning as a form of computer sorcery whose black boxes can't be understood or critiqued, but bias in training data is pretty much the original sin of research methodology: sampling bias, AKA "garbage in, garbage out."
As a Washington Post study found, these systems work great if you are among the "white, highly educated, upper-middle-class Americans, probably from the West Coast," and they suck for everyone else.
As Molly Sauter has pointed out, machine learning systems have two goals: 1. to make you do the thing you did before; and 2. to make you behave like a nonexistent median human being derived from the training data. Good computer systems adapt to suit their users, but bad ones force users to adapt to suit them — look for a future in which people with "accents" learn to code-switch to sound like white, middle-class West Coast computer programmers in order to use the computers in their lives.
The Washington Post teamed up with two research groups to study the smart speakers' accent imbalance, testing thousands of voice commands dictated by more than 100 people across nearly 20 cities. The systems, they found, showed notable disparities in how people from different parts of the U.S. are understood.
People with Southern accents, for instance, were 3 percent less likely to get accurate responses from a Google Home device than those with Western accents. And Alexa understood Midwest accents 2 percent less accurately than those from along the East Coast.
People with nonnative accents, however, faced the biggest setbacks. In one study that compared what Alexa thought it heard versus what the test group actually said, speech from that group contained about 30 percent more inaccuracies.
Why some accents don't work on Alexa or Google Home [Drew Harwell/Washington Post]