How Twitter figures out the world with machine intelligence and Mechanical Turks

Cory Doctorow at 11:12 am Wed, Jan 9, 2013

Except where indicated, Boing Boing is licensed under a Creative Commons License permitting non-commercial sharing with attribution

 


On Twitter's engineering blog, a fascinating description of how Twitter uses a blend of machine intelligence and Mechanical Turk tasks to figure out, in real time, what is going on in the world:

Before we delve into the details, here's an overview of how the system works.

  1. First, we monitor for which search queries are currently popular.
    Behind the scenes: we run a Storm topology that tracks statistics on search queries.
    For example, the query [Big Bird] may suddenly see a spike in searches from the US.

  2. As soon as we discover a new popular search query, we send it to our human evaluators, who are asked a variety of questions about the query.
    Behind the scenes: when the Storm topology detects that a query has reached sufficient popularity, it connects to a Thrift API that dispatches the query to Amazon's Mechanical Turk service, and then polls Mechanical Turk for a response.
    For example: as soon as we notice "Big Bird" spiking, we may ask judges on Mechanical Turk to categorize the query, or provide other information (e.g., whether there are likely to be interesting pictures of the query, or whether the query is about a person or an event) that helps us serve relevant Tweets and ads.

  3. Finally, after a response from an evaluator is received, we push the information to our backend systems, so that the next time a user searches for a query, our machine learning models will make use of the additional information.
    For example, suppose our evaluators tell us that [Big Bird] is related to politics; the next time someone performs this search, we know to surface ads by @barackobama or @mittromney, not ads about Dora the Explorer.
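
To make the quoted loop a little more concrete, here is a rough Python sketch of the three steps: spot a spiking query, hand it to a human judge, and store the answer for later searches. This is not Twitter's code. The class names (`SpikeDetector`, `Pipeline`), the thresholds, and `fake_evaluator` are all hypothetical stand-ins; the real system, per the excerpt, uses a Storm topology, a Thrift API, and Mechanical Turk, none of which are modeled here.

```python
from collections import Counter, deque
from typing import Callable, Deque, Dict, List, Optional

# Hypothetical type: a function that asks a human judge about a query
# and returns their answers (a stand-in for the Mechanical Turk round trip).
Evaluator = Callable[[str], Dict[str, str]]


class SpikeDetector:
    """Step 1 stand-in: flag queries whose count in the current window
    is far above their average over the previous few windows."""

    def __init__(self, history: int = 3, ratio: float = 5.0, min_count: int = 10):
        self.past: Deque[Counter] = deque(maxlen=history)  # recent closed windows
        self.current: Counter = Counter()                  # window being filled
        self.ratio = ratio          # assumed threshold, not from the source
        self.min_count = min_count  # assumed threshold, not from the source

    def observe(self, query: str) -> None:
        self.current[query.lower()] += 1

    def close_window(self) -> List[str]:
        """End the current window and return the queries that spiked in it."""
        spiking = []
        for query, count in self.current.items():
            baseline = sum(w[query] for w in self.past) / max(len(self.past), 1)
            if count >= self.min_count and count > self.ratio * max(baseline, 1.0):
                spiking.append(query)
        self.past.append(self.current)
        self.current = Counter()
        return sorted(spiking)


class Pipeline:
    """Steps 2 and 3 stand-in: send newly popular queries to an evaluator
    and keep the answers where later searches can look them up."""

    def __init__(self, detector: SpikeDetector, ask_evaluator: Evaluator):
        self.detector = detector
        self.ask_evaluator = ask_evaluator
        self.annotations: Dict[str, Dict[str, str]] = {}

    def end_of_window(self) -> None:
        for query in self.detector.close_window():
            if query not in self.annotations:  # only ask about new spikes
                self.annotations[query] = self.ask_evaluator(query)

    def lookup(self, query: str) -> Optional[Dict[str, str]]:
        return self.annotations.get(query.lower())


def fake_evaluator(query: str) -> Dict[str, str]:
    # Pretend judge: a real one would be a person answering a task.
    return {"category": "politics"} if query == "big bird" else {"category": "other"}
```

In use, you would call `observe()` for each incoming search, call `end_of_window()` on a timer, and have the ranking or ad code call `lookup()` at search time. A real deployment would make the evaluator call asynchronous, since a human answer takes far longer than a search request.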

Improving Twitter search with real-time human computation (via Waxy)

I write books. My latest is a YA science fiction novel called Homeland (it's the sequel to Little Brother). More books: Rapture of the Nerds (a novel, with Charlie Stross); With a Little Help (short stories); and The Great Big Beautiful Tomorrow (novella and nonfic). I speak all over the place and I tweet and tumble, too.

MORE:  big data • computer science • mturk • twitter • web theory

  • http://twitter.com/fossilfuels Funk Daddy

    “For example, suppose our evaluators tell us that [Big Bird] is related to politics; the next time someone performs this search, we know to surface ads by @barackobama or @mittromney, not ads about Dora the Explorer.”

    Somewhere a 6-year-old looking for Big Bird on Twitter learned that Mitt Romney is a jerk.

    • http://www.kmoser.com kmoser

      http://www.youtube.com/watch?v=7ufkCzmR7PQ