Zappos hires Mechanical Turk to proofread product reviews

After concluding that well-written, well-punctuated, grammatical reviews increased sales, Zappos hired Amazon Mechanical Turk workers to proofread the reviews on its site, correcting errors without changing the content of the reviews. They claim to have seen a "substantial" revenue increase as a result.
An online retailer noticed that indeed products with high-quality reviews are selling well. So, they decided to take action. They used Amazon Mechanical Turk to improve the quality of its reviews. Using the Find-Fix-Verify pattern, they used Mechanical Turk to examine a few millions of product reviews. (Here are the archived versions of the HITs: Find, Fix, Verify... and if you have not figured out the firm name by now, it is Zappos :-) ) For the reviews with mistakes, they fixed the spelling and grammar errors! Thus they effectively improved the quality of the reviews on their website. And, correspondingly, they improved the demand for their products!

While I do not know the exact revenue improvement, I was told that it was substantial. Given that the e-tailer spent at least 10 cents per review, and that they examined approximately 5 million reviews, this is an expense of a few hundred thousand dollars. (My archive on MTurk-Tracker kind of confirms these numbers.) So, the expected revenue improvement should have been at least a few million dollars!
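
The back-of-the-envelope spend estimate in the quoted passage can be checked in a few lines of Python. This is only a sketch using the two figures given above (at least 10 cents per review, approximately 5 million reviews). The variable names are illustrative, and the real per-review price and review count are not known precisely.

```python
# Figures from the quoted article; both are approximations, not exact data.
cost_per_review_cents = 10      # "at least 10 cents per review"
reviews_examined = 5_000_000    # "approximately 5 million reviews"

# Work in whole cents to avoid floating-point rounding, then convert to dollars.
estimated_spend_usd = cost_per_review_cents * reviews_examined // 100

print(f"Estimated minimum spend: ${estimated_spend_usd:,}")
```

Under those assumptions the floor on the spend is half a million dollars, which is consistent with the article's "a few hundred thousand dollars".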

An ingenious application of crowdsourcing: Fix reviews' grammar, improve sales (via O'Reilly Radar)


  1. I’m often suspicious of well-written, grammatically correct product reviews. I usually assume they are fake (placed by the company or PR firm), and the ones that are imperfect are more likely to be real. But then, I immediately discount “OMG bests book EVAH!!!!1!!,” so maybe there’s a middle ground — just imperfect enough to be believable.

    I think, though, that the Mechanical Turk service is more likely to generate fake-sounding reviews, even if the original sentiment was genuine, as everything gets homogenized down to Strunk-and-White. I suppose that’s better than just blatantly creating fake reviews and social media content, which I had previously thought was the primary purpose of Mechanical Turk.

  2. I thought Mechanical Turk was a den of scams and shoddy work. Who at Zappos verified the quality of the Turk’s contributions? Seems like a nightmare to QC.

  3. Now if they could just put properly sized shoes on the feet of their models in the demo videos they’d have sales through the roof!

  4. Is anyone else bothered by the atrocious grammar (or at least awkward constructions) of the quoted article? “A few millions of” indeed.

    1. Not only that; I was also annoyed by the way both grafs ended in exclamation marks!

      You’d think a competent writer would know at least one more effective way of adding emphasis to an idea!

      Because, when you get right down to it, exclamation marks make writing look juvenile!

  5. I would assume there’s some sort of obscure legalese on the Zappos site stating that all comments and reviews are solely owned by Zappos Inc – therefore Zappos can do what they want with them. I would be suspicious however if Zappos starts disemvoweling comments in the hope of improving sales : )

  6. Sounds like a light form of fraud to me. One of the ways we judge the legitimacy of reviews is by the grammar and punctuation. Perhaps Zappos would offer their Mechanical Turk proofreaders to Nigerian letter scammers as well? And maybe witnesses in civil and criminal cases could hire actors to give their testimony? And…well, you get the idea.

  7. i have to agree w/ Skep it seems a bit fraudulent. I want to read reviews as they are written, not as they are re-written by piece-work copywriters. The original writing gives additional insight into the quality and source of the review, which should be taken into account. Nope, don’t like it at all… although I am not surprised to hear of this happening.

  8. I also agree with Skep and DJBudSonic. It’s strange that they don’t consider the errors part of the “content.” I absolutely pay attention to how well a review is written when shopping. The whole point of customer reviews is that they are unfiltered information about the product. The source of the review is part of the content of the review. If they are corrected, they are not unfiltered and pretty much all information about the source is removed. Leaving me with no way to discern whether the reviewer is an idiot or not (which is what it unfortunately amounts to). As far as I’m concerned, those reviews are no longer customer reviews, they’re now Zappos’ ad copy.

  9. I like my comments free range, grain fed and organically rife with grammatical error. Those pedants need to keep their mechanical processing far away from my reviews.

  10. There are a few angles here one needs to pursue:

    1) Sorry to say, as much as I like Zappos the vast majority of reviews seem fake no matter what product you look at. All filled with the same inordinate enthusiasm towards shoes most normal folks don’t have. For example, look at similar reviews for similar products on Amazon; on Amazon there are clearly humans writing and giving proper praise/warnings. Not so on Zappos.

    2) I tried out Amazon Mechanical Turk a few weeks ago and was 100% horrified. Not because of the scams—there are a few—but the tons of legit stuff. I participated in a few “HITS” to see for myself. And my horror comes from the amount of REAL work folks demand for what amounts to slave wages. Seriously, proofreading used to be a very well paid and respected career. It still is, and in many ways it is an entryway into other forms of writing. But I don’t see how doing work on the Amazon Mechanical Turk leads to anything. The stuff they offer is basically the high-tech equivalent of piece-work! At least if you proofread for a real publication doors open for better things; what “better thing” comes from the Amazon Mechanical Turk?

    3) A good chunk of reviews on sites like Zappos, Yelp and Amazon are 100% pure shillery. Nothing more and nothing less. I can give a few Yelp examples of local restaurants friends and colleagues have panned, but within a WEEK of opening, dozens of very “shilly” reviews pop up on Yelp praising the place.

    This is all to say that I appreciate the underlying concept presented of well written material being treated with more respect. But I think the fact some “invisible hand” needs to pop up to clean the stuff up invalidates the concept of a personal review.

    It all horrifies me because we live in an era where the average consumer has more and more ways of expressing themselves… But those channels seem to be undermined by shills, plants, sock-puppets and marketing departments.

  11. I would have the same issues about a full rewriting of a review, but I’m not sure how a misspelled word gives us insight into the quality of that review. Is someone more or less qualified to tell us that a certain shoe fit comfortably because they were a poor speller? In this case, the outcome might possibly be a more informative service — you get the most information about the product without ancillary cues that lead you to make decisions with the less relevant (to the shoes) information.

    1. I’m not sure how a misspelled word gives us insight into the quality of that review.

      Spend a couple of weeks working as a moderator. 90% of spam comes from South, Central and East Asia, and language clues are a huge part of spam detection. If a large number of product reviews contain language errors that are consistent with ESL writers, or even consistent with one single writer, it’s a good bet that you’re looking at paid reviews.

      1. You are absolutely correct. A great professor is able to look at one of his students’ papers and, based on past assignments, will recognize an inconsistent pattern and raise questions. I see the same with product reviews; I know they are written by the same person or by shills. The language used is too exacting and most honest reviews usually point to one fact concerning the product. Sad fact is, most reviews are bad, as in negative. Online sellers need to discover how to entice customers into submitting positive reviews. Also, the companies that recruit on-line reviewers need to be outed.

        Great post!

        1. When I type, I always accidentally capitalize an extra letter and type a semi-colon instead of an apostrophe. I clean it up before submitting, which a paid reviewer is unlikely to do. If I see these three reviews…

          Betty: OH my god, I;m in love with these shoes!

          Veronica: THese are the most comfortable shoes I;ve ever had!

          Miss Grundy: WOw! I can;t wait for my new shoes to arrive!

          …I’m pretty sure that’s one person with three identities.

  12. Zappos seems to sell mostly expensive shoes. Very expensive shoes. Maybe they have increased profits by jacking up their prices or only offering high markup products?

    1. Have you actually shopped at Zappos or looked at their site? You can easily get cheap sneakers and such from them as well as high-end stuff. In fact the main reason I buy from them is to get cheaper sneakers than I can get locally.

  13. I, for one, think this is a good thing. While it opens the doors for editing the content of reviews, I’m sure there are plenty of sites that already do that. At the very least, if I read a set of reviews looking for real information, I will not need to spend nearly as much time trying to parse the semi-literate sms-speak of J. Random Reviewer. However, it might be more ethically sound to merely hire Mechanical Turk to mark poorly written reviews for deletion. I don’t trust the information-gathering ability of someone who can’t spell ‘you’, and fixing their spelling does nothing to fix the various other mental deficiencies that may be indicated by poor spelling.

  14. I don’t see why everyone thinks this is “fraudulent” or has misgivings over it. Letters to the Editors in newspapers get “edited for grammar and clarity,” whether or not the letter is praising the newspaper or criticizing it. And the newspaper still publishes them under the original author’s name.

    So long as Zappos says that it may edit the reviews for grammar, I think that’s fine. If I leave a nasty review, I can still go back and check if it’s there, and, if not, start raising holy hell on the blogs and cause Zappos negative press.

    Every site owner still has the ability to fake reviews on their own site, if they like, and edit out the bad ones. That has absolutely nothing to do with Mechanical Turk. A company isn’t going to advertise its great new filtering process for reviews if the purpose of the filtering is fraudulent.

    Finally, as everyone since Amazon discovered decades ago, you sell more products if you keep negative, honest reviews, and buyers feel like they can trust the content and find the best product for them. Remember, these aren’t reviews of Zappos, they are reviews of individual shoes that Zappos retails. If users see a negative review, they go and find a better shoe on Zappos. It’s just like seeing a negative review of a book on Amazon.

  15. Lately; I visit BB less. I notice many typing errors. Still, too many writers rely on Spell Check. Take this email I wrote while high on Ambien; I simply auto-corrected misspelled words.

    Ever day this is the first site I visit. It takes you to a random place in the whorl. They call it “view on the bay”. I life it because you are not bound my travel guides and cities gleaned up to apple to tourists. The vocation changes daily and the site is easy to us, it’s called a point-and-lick. Today you can “visit” Hualien County, Taiwan. I thing operating one of these cameras would me by dream jog.

    (When I read the email it made sense to me, and it still does. My brain compensated for the confusion and yours will too, whether you know it or not.)

    After my dad sent it back I realized what had happened and was able to read it again, Ambien free:

    Every day this is the first site I visit. It takes you to a random place in the world. They call it “view of the day”. I like it because you are not bound by travel guides and cities cleaned up to appeal to tourists. The location changes daily and the site is easy to use, it’s called a point-and-click. Today you can “visit” Hualien County, Taiwan. I think operating one of these cameras would be my dream job.

    Prefixes and suffixes are mixed up. Reloading is typed as preloaded

  16. Even this BB post;

    They claim to have seen a “substantial” revenue increase as a result.

    As a result, they claim to have seen a “substantial” increase in revenue.

    We could go on for hours on which is correct. “Substantial” is the adjective. What can be argued is the noun it supports. The word “increase” can be a noun or a verb. “Revenue” is only a noun. Since increase can be one or the other, it should be forced. By writing it the second way this can be done and the word’s role established.

    1. I would even kill, “as a result.” But I’m a hack-n-slash editor of the Zinsser School.

      And I’m not afraid to start a sentence with a conjunction.

  17. While I think there’s a valid point to be made in terms of the fact that fake reviews look fake, and that this is helpful in determining whether to trust them, I sincerely doubt that most of the people on Mechanical Turk are great proofreaders.

    I play Poupee Girl. There are a lot of people who use Mechanical Turk to buy game currency even though you would actually get more “jewels” by earning the money almost any other way, because they feel that they’re not spending their “real money” on “pixels”. I love these (mostly) girls, but they’re not what I would call copyeditors; a quick glance at their posts on Poupee tells you that their English skills are a step up from some of the folks who do product reviews on Wal*Mart, but none of them has got Strunk & White memorised. :)

    I don’t buy shoes at Zappos a lot because their prices are really high (Jack, you should try,, Shoe Trader or even Amazon itself when they’re having a sale) due to the fact that they offer free shipping on returns both ways and they have to pay for it somehow. Their customer service is awesome, and sometimes they have colourways I can’t find elsewhere, but by and large, it’s not worth it to shop there if you know your size in a given brand. (Being a shoe fiend I usually do.)

    So when I’m looking at reviews, I look for details about the product, and I mainly pay attention to negative reviews. (I will still buy products with negative reviews. In fact I often do. What I look for in negative reviews is why other people didn’t like the product–whether it’s poor workmanship, bad materials or different priorities, or even something that is a negative for another person and a positive for me! If I can tell from the negative review that a pair of pants is big in the bum and tiny in the waist, and that this was a problem for someone with a straighter build, those pants are probably great for me! Similarly, if shoes are too wide for someone else, I’m liable to like them.)

    I don’t think most people even read reviews–most people buy stuff that other people they know like or that some celebrity likes. (I once had a phone that I bought because it looked like a magenta ST:TOS communicator with blue trim and was regaled for months with questions about whether or not I had bought it because Paris Hilton had one. Ugh.)

    If that kind of information is missing from reviews they’re useless, and that kind of information is REALLY hard to fake. Positive reviews that describe a customer’s expectations and how they were exceeded are also useful, but they’re hard to fake, too, because they also depend upon actual familiarity with the product. I bought a foam mattress due to a positive review which described the process of unpacking and setting it up so clearly (and then gave the reasons why the reviewer liked the mattress) that I knew I would also like it.

    So count me in on the side of not particularly caring whether the reviews are edited for SPAG. I’d be all for it, in fact, were it not for the labour issue and the fact that I used to work as a proofreader for a small press and I wish such jobs still existed.

    It hurts my eyes to read bad grammar, and if I was looking through a paper catalogue you can be sure the reviews in there would be edited for SPAG (and also uniformly positive). I expect websites, like paper catalogues, to try to sell me a product, which is why I give less weight to positive reviews :)

    If Zappos is benefitting from proofed reviews it’s probably due to some factor that affects impulse buyers, because I know that I will often look at shoes on Zappos, then see if I can’t find them cheaper somewhere else…and I usually can.

  18. It would be interesting to use a Markov chain algorithm to create “random” sentences from combinations of existing reviews, and then have Mechanical Turk people correct them. How long before what you get is actually believable I wonder.

    Here’s a collection of comments from this thread “remixed” via a Markov chain algorithm…

    Great professor is able to look at one of his students papers get “edited for what amounts to slave wages. Seriously, proofreading used to be outed.

    I tried out Amazon Mechanical Turk to make decisions with the less relevant (to the same inordinate enthusiasm towards shoes that a certain shoe fit comfortably because of they offer is praise/warning the less relevant (to the quality of someone who can’t spelling. I don’t have. For example, look at. All filled with the less relevant (to the tons of legit stuff. I participated in a few “HITS” to say, as much as I like Zappos negative review, they are absolutely nothing the product you keep negative reviews, I know they are written by the same with product you look at. All filled with Mechanical Turk. A companies that review of a book on the Amazon Mechanical Turk a few weeks ago and find the newspaper or criticizing it. Letters to the ability to fake no matter what “better thinks this is “fraudulent.

    Finally, as everyone since Amazon there are plenty of that may be indicated by poor spelling. I don’t have. For example, look at one of his students papers and, based on past assignments, will not need to discover how to entice customers into other forms of writing ability of someone more or less relevant (to the shoes) informative service — you get the same inordinate enthusiasm towards shoes that may edit the reviews for deletion. I don’t spell ‘you’, and fixing the newspapers get “edited for grammar and clarity,” whether or not the Amazon Mechanical Turk.

  19. I do understand how spam reviews could be an issue, but as someone else mentioned above, for something like zappos, a review that says “I loved this shoe” really doesn’t help much. Most of the time you’re looking for someone who explains how they have a certain kind of foot and this shoe did or did not work for them, or the materials were a certain way. Given that the site has free shipping both ways, for them to just have a lot of spam positive reviews could hurt them, in the sense that they have many more returns and an overall reputation for being unreliable.

    More mturk goodness, for those of you who haven’t heard much about it. Social scientists are starting to use it more recently to test some basic findings (more cognitively-oriented things, generally) and when those results are compared to in-person findings, they’re often very similar. The upside is you can collect data quickly, and it’s often more representative than what’s currently being collected on college campuses. Also, the going rate for a turker is <$2/hr, but you should consider that a lot of people (myself included) do this as a quick side thing while browsing the net, as the money goes straight into your amazon account. Others may be from areas of the world where that $2 is more meaningful.

  20. I’m just impressed that there’s anything on Mechanical Turk besides:

    1) Like my page!
    2) Expose all of your personal information so I know you liked my page!
    3) Okay, now I’ll give you $0.07!
    4) You know, I don’t think you did a great job, so I’m not going to pay you.

    I spent a few hours looking around, but I would never participate. Though, realistically, it’s the “like my page”/”vote for my entry” part that bothers me most. For example, I’d like to take a baseball bat to anyone who refused to pay for services rendered, but if they didn’t buy votes/likes, I wouldn’t mind letting them walk away from it.

  21. Certainly violating my freedom of speech, which includes the freedom to misspell and use bad grammer. What a dork, this Turk story writer is.
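
The Markov chain "remix" described in comment 18 can be sketched roughly as follows. This is a minimal word-level illustration, not the tool that commenter actually used (the thread does not say which one, or whether it worked on words or characters). The function names `build_chain` and `remix` and the `order`, `length`, and `seed` parameters are hypothetical, and the sample text is just two sentences borrowed from comment 10.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def remix(chain, length=40, seed=None):
    """Walk the chain from a random starting state, emitting up to `length` words."""
    if not chain:
        return ""
    rng = random.Random(seed)
    state = rng.choice(list(chain.keys()))
    output = list(state)
    for _ in range(length - len(state)):
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state was never followed by anything
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Two sentences from the thread, used here only as sample input.
sample = (
    "I tried out Amazon Mechanical Turk a few weeks ago and was 100% horrified. "
    "But I don't see how doing work on the Amazon Mechanical Turk leads to anything."
)
chain = build_chain(sample, order=2)
print(remix(chain, length=20, seed=1))
```

With so little input the output mostly replays the source. Feeding in the whole thread, as the commenter did, gives the chain more places to jump between comments, which is where the half-plausible nonsense comes from.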
