On Twitter's engineering blog, a fascinating description of how Twitter uses a blend of machine intelligence and Mechanical Turk tasks to figure out, in real time, what is going on in the world:
Before we delve into the details, here's an overview of how the system works.
- First, we monitor for which search queries are currently popular.
Behind the scenes: we run a Storm topology that tracks statistics on search queries.
For example, the query [Big Bird] may suddenly see a spike in searches from the US.
- As soon as we discover a new popular search query, we send it to our human evaluators, who are asked a variety of questions about the query.
Behind the scenes: when the Storm topology detects that a query has reached sufficient popularity, it connects to a Thrift API that dispatches the query to Amazon's Mechanical Turk service, and then polls Mechanical Turk for a response.
For example: as soon as we notice "Big Bird" spiking, we may ask judges on Mechanical Turk to categorize the query, or provide other information (e.g., whether there are likely to be interesting pictures of the query, or whether the query is about a person or an event) that helps us serve relevant Tweets and ads.
- Finally, after a response from an evaluator is received, we push the information to our backend systems, so that the next time a user searches for a query, our machine learning models will make use of the additional information. For example, suppose our evaluators tell us that [Big Bird] is related to politics; the next time someone performs this search, we know to surface ads by @barackobama or @mittromney, not ads about Dora the Explorer.
Improving Twitter search with real-time human computation
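To make the first two steps of that pipeline concrete, here is a minimal sketch in Python of spotting a spiking query and handing it off for human evaluation. It is not Twitter's code: the post describes a Storm topology talking to a Thrift API, while this toy version just counts queries in fixed windows. The `SpikeDetector` class, its thresholds, and the `dispatch_to_evaluators` helper are all made up for illustration.

```python
from collections import Counter, deque


class SpikeDetector:
    """Flag queries whose count in the current window jumps well above their recent average.

    Hypothetical illustration; the thresholds are arbitrary assumptions.
    """

    def __init__(self, history_windows=3, min_count=5, ratio=3.0):
        self.history = deque(maxlen=history_windows)  # counts from past windows
        self.current = Counter()                      # counts for the open window
        self.min_count = min_count                    # ignore very rare queries
        self.ratio = ratio                            # required jump over baseline

    def observe(self, query):
        """Record one search for the given query in the open window."""
        self.current[query.lower()] += 1

    def close_window(self):
        """End the open window and return the queries that spiked in it."""
        spiking = []
        for query, count in self.current.items():
            past = [window[query] for window in self.history]
            baseline = sum(past) / len(past) if past else 0.0
            if count >= self.min_count and count >= self.ratio * max(baseline, 1.0):
                spiking.append(query)
        self.history.append(self.current)
        self.current = Counter()
        return sorted(spiking)


def dispatch_to_evaluators(query):
    """Hypothetical stand-in for sending a query to human judges.

    A real system would call a crowdsourcing service here and poll for answers.
    """
    return {"query": query, "question": "Which category best describes this query?"}


# Three quiet windows, then a window where "big bird" suddenly appears.
detector = SpikeDetector()
for _ in range(3):
    for q in ["weather", "weather"]:
        detector.observe(q)
    detector.close_window()

for q in ["Big Bird"] * 6 + ["weather"] * 2:
    detector.observe(q)

tasks = [dispatch_to_evaluators(q) for q in detector.close_window()]
```

In this toy run only "big bird" clears both thresholds, so it is the only query that would be sent to the judges. The third step, feeding the judges' answers back into search and ad models, is left out because the post gives no detail on how that is done.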
We've talked here before about the crazy things you can find when you read the "Methods" section of a scientific research paper. (Ostensibly, that's the boring part.)
If you want a quick laugh this morning — or if you want to get a peek at how the sausages are made — check out the Twitter hashtag #overlyhonestmethods, where scientists are talking about the backstory behind seemingly dry statements like "A population of male rats was chosen for this study".
Warren Ellis, always a shrewd observer of online media, supposes that we've reached peak social media, the point at which exciting new communications forms ossify into dull media titans:
Twitter alters its terms of access to its information, thereby harming the services that built themselves on that information. Which was stupid, because Twitter gets fewer and fewer material benefits from allowing people to use its water. And why would you build a service that relies on a private company’s assets anyway? Facebook changes its terms of access regularly. It’s broken its own Pages system and steadily grows more invasive and desperate. Instagram, now owned by Facebook, just went through its first major change in terms of service. Which went as badly as anyone who’s interacted with Facebook would expect. As Twitter disconnected itself from sharing services like IFTTT, so Instagram disconnected itself from Twitter. Flickr’s experiencing what will probably be a brief renaissance due to having finally built a decent iOS app, but its owners, Yahoo!, are expert in stealing defeat from the jaws of victory. Tumblr seems to me to be spiking in popularity, which coincides neatly with their hiring an advertising sales director away from Groupon, a company described by Techcrunch last year as basically loansharking by any other name.
This may be the end of the cycle that began with Friendster and Livejournal. Not the end of social media, by any means, obviously. But it feels like this is the point at where the current systems seize up for a bit. Perhaps not even in ways that most people will notice. But social media seems now to be clearly calcifying into Big Media, with Big Media problems like cable-style carriage disputes. Frame the Twitter-Instagram spat in terms of Virginmedia not being able to carry Sky Atlantic in the UK, say (I know there are many more US examples).
His closing remark is "I wonder if anyone’s been thinking twice about giving up their personal websites." Good question.
The Social Web: End Of The First Cycle
In "Credibility ranking of tweets during high impact events," a paper published in the ACM's Proceedings of the 1st Workshop on Privacy and Security in Online Social Media, two Indraprastha Institute of Information Technology researchers describe the outcome of a machine-learning experiment designed to discover factors correlated with reliability in tweets during disasters and emergencies:
The number of unique characters present in tweet was positively correlated to credibility, this may be due to the fact that tweets with hashtags, @mentions and URLs contain more unique characters. Such tweets are also more informative and linked, and hence credible. Presence of swear words in tweets indicates that it contains the opinion / reaction of the user and would have less chances of providing information about the event. Tweets that contain information or are reporting facts about the event, are impersonal in nature, as a result we get a negative correlation of presence of pronouns in credible tweets. Low number of happy emoticons [:-), :)] and high number of sad emoticons [:-(, :(] act as strong predictors of credibility. Some of the other important features (p-value < 0.01) were inclusion of a URL in the tweet, number of followers of the user who tweeted and presence of negative emotion words. Inclusion of URL in a tweet showed a strong positive correlation with credibility, as most URLs refer to pictures, videos, resources related to the event or news articles about the event.
Of course, this is all non-adversarial: no one is trying to trick a filter into mis-assessing a false account as a true one. It's easy to imagine an adversarial tweet-generator that suggests rewrites to deliberately misleading tweets to make them more credible to a filter designed on these lines. This is actually the substance of one of the cleverest science fiction subplots I've read: Peter Watts's Behemoth, in which a self-modifying computer virus randomly hits on the strategy of impersonating communications from patient zero in a world-killing pandemic, because all the filters allow these through. It's a premise that's never stopped haunting me: the co-evolution of a human virus and a computer virus.
Credibility Ranking of Tweets during High Impact Events [PDF]
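For a feel of what those signals look like in practice, here is a minimal sketch in Python that pulls the surface features named in the quoted passage out of a tweet's text. It is not the authors' code, and it stops at feature extraction; the paper's ranking model is not reproduced. The function name `credibility_features` and the tiny pronoun and swear-word lists are placeholders assumed for illustration.

```python
import re

URL_RE = re.compile(r"https?://\S+")
HAPPY_EMOTICONS = (":-)", ":)")
SAD_EMOTICONS = (":-(", ":(")

# Illustrative word lists only; a real system would use much larger lexicons.
PRONOUNS = {"i", "me", "my", "we", "our", "you", "your", "he", "she", "they"}
SWEAR_WORDS = {"damn", "hell"}


def credibility_features(text, follower_count=0):
    """Return the surface features mentioned in the quoted passage for one tweet."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "unique_chars": len(set(text)),                        # distinct characters
        "has_url": bool(URL_RE.search(text)),                  # URL present?
        "pronoun_count": sum(w in PRONOUNS for w in words),    # personal language
        "swear_count": sum(w in SWEAR_WORDS for w in words),   # opinion / reaction
        "happy_emoticons": sum(text.count(e) for e in HAPPY_EMOTICONS),
        "sad_emoticons": sum(text.count(e) for e in SAD_EMOTICONS),
        "followers": follower_count,                           # supplied by caller
    }


report = credibility_features("Flooding on Main St, photos: http://example.com/a :(", 1200)
reaction = credibility_features("I am so damn scared :-(")
```

By the paper's account, the first tweet (a URL, no pronouns, a sad emoticon) shows more of the patterns associated with credibility than the second, which is a personal reaction with a pronoun and a swear word. Turning these features into an actual score would need a trained model and labeled data, which this sketch does not attempt.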
What's the deal with texting? Are you being sarcastic? Are you mad at me? Are you typing this while on the toilet? I don't wanna be a meme! Did you ever stop to think about how incredibly perfect Seinfeld would be in today's social media-crazed world? Thanks to the newly formed Modern Seinfeld Twitter account, you can get a 140-character (or less) idea of what a current episode of the "Show About Nothing" would cover. And when you consider all the "nothing" we do all day and how much awkward human behavior it causes, Seinfeld could probably find enough material to last twenty years. (via Twitter)
A monkey in a nice coat escaped from a cage inside its owner's car, opened the car door, and strolled into an Ikea in North York, a suburb of Toronto. The monkey was removed shortly thereafter. I have been stuck in that Ikea and I can testify that whatever your feelings about the ethics of keeping a pet monkey (or sticking it in a cage in your car), it is certainly a mercy to remove the monkey from that Ikea.
The incident spawned two parody Twitter accounts: @IKEAmonkey and @Ikea_Monkey, the former being more prolific (and having made overtures of peace and cooperation to the latter, without, it seems, any success).
At around 3 p.m. ET, the diminutive primate was spotted in the store’s upper parking lot, where it was cornered by several Ikea staff members, who also called animal control to come retrieve the monkey.
Mysterious monkey in posh miniature winter coat found alone at Toronto Ikea [National Post]
Umm saw a monkey in the #ikea parking lot. (by #broniewyn)
Latvian magazine Ir created a Twitter account written by local birds: it covers the keys of an outdoor keyboard with unsalted fat and uses the birds' pecking to generate 100 tweets a day for the @hungry_birds account.
Everyone has the right to be heard - that is the main principle of Ir, weekly magazine from Latvia.
That's why we have fixed the biggest internet injustice of all times and gave Twitter back to original twitterers, the birds!
What you see here is being streamed live from Sarnate, a small village on the west coast of Latvia.
We put a layer of unsalted fat on a keyboard. Eating the fat helps the birds to survive the harsh winter days and nights when the temperature can drop to 20C below zero.
@hungry_birds are awake from 05.00 until 16.00 GMT, but they have other daily activities and duties besides eating, so be patient and have fun!
Birds on Twitter
@AuthenticWmGibs is a funny fake William Gibson Twitter account, which tweets plausible-sounding precis of imaginary Gibson novels (or, as the Twitter bio has it, "Synopses for William Gibson novels that are definitely 100% real, but only in a timeline with greater authenticity than this one.")
The @ElBloombito Twitter account is a running -- and hilarious -- sendup of NYC Mayor Michael Bloomberg's terrible Spanish. Salon's Mary Elizabeth Williams profiled Rachel Figueroa-Levin, the mastermind behind the account.
In the past two days, El Bloombito’s pidgin Español Twitter stream has been a balm to disaster-scarred New Yorkers, a bracingly funny respite from the ravages of Sandy. Prior to the storm, it was Bloombito who warned New Yorkers, “Cuidado! El stormo somos about to que vamos el lañdfall! Batteño los hatches!” and “Por favor to remaiño insidero until notice de furthero. Peligroso!” Afterward, it was Bloombito who reminded, “El floodo agua esta still todos los everywhere. Necesitos los gearo de scuba y el flipper!”
Speaking to Salon while her toddler daughter takes a post-Sandy afternoon nap, Figueroa-Levin says El Bloombito originally “gave me something to do while I was stuck inside” during Irene. As it happened, the account attracted an instant following — and the attention of Mike Bloomberg himself — who admitted last year that “Es difícil para aprender un nuevo idioma.”
“I don’t know why he does it,” Figueroa-Levin says. “Not that my Spanish is that fantastic, but I live in a neighborhood where it’s common. I grew up hearing it. I’m Puerto Rican. And I don’t know who he thinks he’s talking to. In fact, last year I had an elderly Dominican neighbor tell me he thought Bloomberg was Italian.”
Meet the woman behind “El Bloombito”
(via Making Light)
Great news, you guys! Soon, you'll be able to tweet iPhone or Android snapshots of your sandwich in sepia, without even having to download Instagram. Nick Bilton at the NYT got the scoop. -- Xeni
People give Twitter plenty of guff, but at least its promoted tweets program is straight-up advertising--unlike the awful "pay to reach your own followers" stunt that Facebook is pulling.
Last December, Nicole J Caruth posted this photo of a "Mondrian cake" to her Twitter stream. What a fabulous piece of work!
Finally trying the Mondrian cake
Amanda Palmer was musing about the messed-up state of US health insurance, so she took to Twitter, writing about it under the #InsurancePoll tag ("quick twitter poll. 1) COUNTRY?! 2) profession? 3) insured? 4) if not, why not, if so, at what cost per month (or covered by job)?"). The tag's blown up, trending across the USA, as people weigh in with their insurance horror stories. Then a volunteer statistician came forward to compile a report on the data generated by the poll. They're looking for lots more people to step forward and participate.
runaway twitter insurance poll & the power of social media & sharing stories
i’ll post the gathered data as soon as it’s ready. the results, as DM’d to me a few hours ago by @aubreyjaubrey:
– preliminary info from first 156 responses indicates 24.5% of US respondents do not have insurance because of cost.
– 31.4% of responses were from outside of US. all but one person had some kind of compulsory or government supported healthcare – (that one person was denied)
– 24.4% of those abroad have some employer/private insurance for optometry and dental. individual costs from $45-$90/month. around $250/mo for a family.
– based on responses, Germany appears to be the only other country with extortionate health care costs.
a few hours ago aubrey posted she was off to bed but would continue today and that so far, 240 sets of data had been entered.