Apple's Siri vs. Japanese-accented English

In this video, an increasingly frustrated native Japanese speaker discovers that Siri is unable to parse the spoken English word "work" when voiced with a typical Japanese accent. (kenjikinukawa via Joi Ito)


  1. The inverse of this happened to me when I was playing Brain Age on the Nintendo DS (a few years back).  In one of the games, the name of a color is flashed on the screen, and you need to quickly say the color of the lettering (not what the word says).  It was frustrating because I kept getting false negatives on certain words, until my friend advised me to use Japanese-accented English on words like “yellow.”  Suddenly, I was scoring much higher, and much borderline politically incorrect mirth ensued.

  2. My high school typing teacher told me I should be taking the class more seriously. I laughed and said to her, “We’ll all be talking to our computers in two years.” That was 1988.

    1. I had a CompSci teacher in college preach to us about how keyboards were going to be useless in 10 years because all software will be voice commanded. So I asked “what if you’re in a noisy subway or don’t want everybody to know what you’re doing?” He got so mad his face turned red and I kept getting bad marks on my assignments, but it was totally worth it.

  3. The really awful part of this is that Siri is asking a multiple choice question: all she needs is to distinguish walk/fuck/wall from home. I hope in future versions they won’t make the speech recognition fuck any harder than it has to.

  4. You’d think with a binary choice like this it could best-fit to one of the two choices. For the most part it correctly identifies the /w/ and /k/, that’s more than enough to distinguish between “work” and “home.”

  5. Given the massive pronunciation differences between Japanese speech and English, I’m not surprised in the slightest that a speech system designed for English vocal patterns and phonemes is going to fail with such an accent. Meanwhile, this video is from about a year ago – while I don’t expect Siri to fare any better today, it’s also not showing the current state of things.

          1.  You’d be surprised.  I teach a variety of non-native Japanese speakers, and their accents in Japanese are pretty atrocious.

          2. I had several Japanese roommates in the US.  Once you’ve spent five years listening to English in a thick Japanese accent, it becomes quite easy to replicate.

          3. Almost 10 years studying Japanese, 3 of which were spent living and working in Japan, and I still have difficulty pronouncing ‘myo’ (as in the town, Myodani) vs ‘miyo.’ I can hear how I’m wrong, but I just can’t produce the right sounds when I’m talking quickly.

    1. I’m sure that, if I was to whip up a half-broken piece of beta software, I wouldn’t market it as the most important new feature of my new phone. Especially if I was a company that tried to sell myself as making things that “just work”.

  6. I still find it odd that Apple has put so much focus on a feature that is ultimately just a gimmick for most people, especially at its current level of functionality, while so badly failing at something so heavily relied-upon as mapping.

    1.  Because they’re new to the cartography business. Doesn’t anyone remember tumblrs and such that were devoted to Google Maps errors, not so long ago?

      1. If you use Siri to search for information, you’re less likely to need to go to Google to search for things. Over time Siri could become your Search Engine of choice, and Apple could start to charge for sponsored results and usurp Google Ads. Even if Apple doesn’t charge, they’ve still dealt a huge blow to Google by reducing Google’s traffic. The prize for getting voice activated search right is huge for both Apple and Google.

  7. I know this was designed as an object lesson, but a “real” user would have just tapped “work” on the screen and gotten on with his life.
    Worth noting that it had no problem recognizing that he wanted to send an email, and I could barely understand that query myself.

  8. I find it interesting that anyone would see the problem as being more about Siri’s poor speech recognition rather than the guy’s poor English pronunciation. I totally agree that Siri should be sophisticated enough to pick the closest-sounding word between the two given choices… but I also can’t help thinking that the guy thoroughly squandered the six years of compulsory English lessons he (presumably) had in junior-high and high school… As an English teacher in Japan I try my darnedest to get my students out of thinking, “Oh well, katakana English is close enough”… because as this video demonstrates, sometimes it just isn’t… ;-)
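
Several commenters above suggest that, with only two options on the screen, the assistant could simply snap whatever it heard to the closest of the offered choices. A minimal sketch of that idea follows. It is not how Siri works; the names `edit_distance` and `best_fit` are hypothetical, and spelling distance is used here only as a rough stand-in for how similar two words sound.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def best_fit(heard: str, choices: list[str]) -> str:
    """Return the offered choice closest to the recognized text.

    Assumption: the recognizer hands back plain text, and spelling
    similarity approximates sound similarity well enough for a short
    closed list of options.
    """
    heard = heard.strip().lower()
    return min(choices, key=lambda c: edit_distance(heard, c.lower()))


if __name__ == "__main__":
    options = ["work", "home"]
    for heard in ["walk", "wall", "home"]:
        print(heard, "->", best_fit(heard, options))
```

A real recognizer would more plausibly compare phoneme sequences or acoustic scores than spellings, but the principle is the same: when the set of valid answers is tiny, pick the nearest one rather than rejecting the input.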
