On Twitter's engineering blog, a fascinating description of how Twitter uses a blend of machine intelligence and Mechanical Turk tasks to figure out, in real time, what is going on in the world:
Before we delve into the details, here's an overview of how the system works.
First, we monitor for which search queries are currently popular.
Behind the scenes: we run a Storm topology that tracks statistics on search queries.
For example, the query [Big Bird] may suddenly see a spike in searches from the US.
As soon as we discover a new popular search query, we send it to our human evaluators, who are asked a variety of questions about the query.
Behind the scenes: when the Storm topology detects that a query has reached sufficient popularity, it connects to a Thrift API that dispatches the query to Amazon's Mechanical Turk service, and then polls Mechanical Turk for a response.
For example: as soon as we notice "Big Bird" spiking, we may ask judges on Mechanical Turk to categorize the query, or provide other information (e.g., whether there are likely to be interesting pictures of the query, or whether the query is about a person or an event) that helps us serve relevant Tweets and ads.
Finally, after a response from an evaluator is received, we push the information to our backend systems, so that the next time a user searches for a query, our machine learning models will make use of the additional information. For example, suppose our evaluators tell us that [Big Bird] is related to politics; the next time someone performs this search, we know to surface ads by @barackobama or @mittromney, not ads about Dora the Explorer.
Texas and Chile have remarkably similar flags (though Chile got theirs first, by a matter of decades) and Texas doesn’t have a Unicode-defined emoji for its flag (just a sprinkling of proprietary ones that do not cross platforms gracefully), so Texans have taken to using the Chilean flag emoji as a shorthand for the longhorn […]
It’s one thing to set up an “anonymous” Twitter Hulk account whose anonymity your friends and colleagues can’t pierce, because the combination of your care not to tweet identifying details, the stilted Hulk syntax, and your friends’ inability to surveil the global internet and compel phone companies to give up their caller records suffice for […]
Techdirt, a fearless source of excellent technology news and commentary, is being sued for $15M by Shiva Ayyadurai, who claims to have invented email — he is represented by Charles Harder, a key figure in the Gawker-killing legal campaign that Peter Thiel financed, and who is also representing Melania Trump in her $150m lawsuit against […]
Python is immensely popular in the data science world for the same reason it is in most other areas of computing—it has highly readable syntax and is suitable for anything from short scripts to massive web services. One of its most exciting, newest applications, however, is in machine learning. You can dive into this booming […]
Learning new skills is a great way to improve your resume and stand out from other candidates. Especially in a workforce in which many job-seekers have a wide variety of qualifications. With lifetime access to Virtual Training Company, you won’t have to choose a specific focus. You can pick up new expertise whenever you deem it […]
Instead of throwing out all the empties after your next party, why not transform them into some new DIY glassware? Cut back on waste and add some home ambiance with the Kinkajou Bottle Cutter and Candle Making Kit.The Kinkajou is designed as a clamp-on scoring blade to make precise cuts. Just slide a bottle in, tighten […]