Twitter claims a 90 percent accuracy rate for the clever techniques it uses to learn the gender of any given user. Glenn Fleishman reports on the company's disconcerting new analytics tools, the research behind them, and how large a pinch of salt they come with.

Twitter opened its analytics platform to every user on August 27, allowing all of us — not just verified users and those with advertising accounts — to track how many people viewed and acted upon our tweets. But the "Followers" section, revealing demographics, provoked the most discussion. Alongside breakdowns in followers' interests and location is a gender bar that splits followers into male and female.

Many women were surprised at how many of their followers, whether they had hundreds or thousands, were men. The ratio is often in the 75 to 80 percent range, and it's easy to find thousands of tweets reporting as much. Some of the authors work in tech, a male-dominated industry, but others tweet about unrelated subjects. Why such a heavy skew there?

Forget for a moment the problem, in 2014, with offering a simple duality for gender, which brings with it biases and assumptions.

How can Twitter offer this information when it doesn't ask you to indicate a gender when you sign up for an account?

The service analyzes our tweets and uses word choice, proximity, and other factors to make a guess. According to a 2012 post on its Advertising blog, the company relies on multiple signals to assign confidence to a gender selection. For matches with a high confidence level, Twitter tested its results against a global panel of humans and found its approach 90 percent accurate. The post noted, "…where we can't predict gender reliably, we don't." (Twitter didn't reply to a request for information for this article.)

As Robin James, a professor of philosophy and women's/gender studies at the University of North Carolina–Chapel Hill, tweeted recently, "The Twitter analytics method of reading gender just shows social identity isn't bodily features anymore, but behavioral patterns."

Twitter's marketing research didn't come out of a vacuum. Researchers have long analyzed cues that arise out of modes of expression to determine personal characteristics that aren't explicitly mentioned or known to a reader. A 2002 book, revised in 2013, Reading, Writing, and Talking Gender in Literacy Learning, has a chapter that identifies and summarizes 42 studies mostly from the 1990s examining marks of gender in student writing, and it's just scratching the surface of the literature.

It's no surprise that such work would be extended when massive corpuses could be analyzed and then checked for accuracy using control cases in which gender was known, as when the writer (on a blog, social network, or other public platform) provided explicit details about themselves. This allows refinement on a scale never before possible.

One researcher, Delip Rao, was the lead author on several papers during his time at the Human Language Technology Center of Excellence at Johns Hopkins University that dealt with algorithmic methods of identification. Many of his co-authored papers talk about "latent attributes," those implicit specific details about people that can be surfaced, including ethnicity and gender.

Some of the "tells" in tweets and other messages are the stereotypical ones that would leap to mind, and we shouldn't be surprised that they test out as valid. In one paper from 2010, the researchers note that "OMG" is used four times as often by women as by men in the dataset of Twitter messages they tested. The phrase "my zipper" has an extremely high predictive value for men, while "my yoga" has the same effect for women. The paper even notes, "People laugh differently on Twitter as well. While women LOL, men tend to LMFAO."
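To make the idea of a "tell" concrete, here is a minimal sketch of how a usage ratio like the one quoted for "OMG" might be computed: count how often a token appears per 1,000 words in two groups of messages and divide one rate by the other. This is not the researchers' method, which the article does not describe in detail. The function names and the toy messages are invented purely for illustration.

```python
from collections import Counter


def token_rate(messages, token):
    """Occurrences of `token` per 1,000 words across a list of messages."""
    # Crude tokenizer (an assumption): split on whitespace, strip basic
    # punctuation, and lowercase everything.
    words = [w.strip(".,!?").lower() for m in messages for w in m.split()]
    if not words:
        return 0.0
    return 1000 * Counter(words)[token.lower()] / len(words)


def usage_ratio(group_a, group_b, token):
    """How many times more often group_a uses `token` than group_b does."""
    rate_b = token_rate(group_b, token)
    if rate_b == 0:
        return float("inf")
    return token_rate(group_a, token) / rate_b


# Invented toy messages, NOT real tweets and NOT the paper's data.
group_a = ["omg that show", "so late omg omg", "omg ok lol"]
group_b = ["omg traffic again", "lmfao nice one", "back at work today"]

print(usage_ratio(group_a, group_b, "omg"))
```

A real classifier would combine many such signals and would need far more data than a handful of messages; a single ratio like this says nothing reliable about any individual account.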

The MITRE Corporation released a much-cited study in May 2011 that attempted to predict gender and other factors, and found accuracy ranging from 76 percent using the text of tweets alone to 92 percent using tweets, the account description, screen name, and full name (as provided). A September 2013 paper examined "700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers." While the authors focused mostly on "psychological insights," based in part on conducting personality tests on all the volunteers, they report an accuracy rate of 91.9 percent for gender using "language" alone, not including other data they gathered.

Rao went to work for Twitter in October 2011, and there's been a lot less research in the field in the two years since, with the 2012 and 2013 papers the only ones widely cited. It's possible Twitter, Facebook, and others recruited other researchers with similar interests, which would explain the paucity of papers, since there's such a huge financial reward associated with precise targeting of attributes.

In its 2012 post, the company said it only asserts gender when it is "reliable," but we don't know what percentage of the time that was; and even among the reliable data, Twitter was wrong 10 percent of the time, assigning the incorrect gender. Counting the accounts it couldn't classify reliably, the effective error rate could have been 20 percent or more. Let's assume this research has only improved, and Twitter now has extremely high confidence and accuracy, which it measures above 90 percent. Things still don't add up.
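One way to read that arithmetic, offered as a sketch rather than as Twitter's own accounting: the 90 percent figure applies only to accounts the company chose to label, so the share of all accounts that end up correctly labeled depends on the unknown coverage. The function name and the coverage values tried below are assumptions chosen for illustration.

```python
def labelling_outcomes(coverage, accuracy):
    """Split all accounts into (correct, wrong, unlabeled) shares.

    coverage: fraction of accounts that receive a gender label at all
              (unknown for Twitter; the values used below are guesses).
    accuracy: fraction of labeled accounts whose label is right.
    """
    if not (0 <= coverage <= 1 and 0 <= accuracy <= 1):
        raise ValueError("coverage and accuracy must be between 0 and 1")
    correct = coverage * accuracy
    wrong = coverage * (1 - accuracy)
    unlabeled = 1 - coverage
    return correct, wrong, unlabeled


# Hold accuracy at the claimed 90 percent and vary the hypothetical coverage.
for coverage in (1.0, 0.9, 0.7, 0.5):
    correct, wrong, unlabeled = labelling_outcomes(coverage, 0.90)
    print(f"coverage {coverage:.0%}: correct {correct:.0%}, "
          f"wrong {wrong:.0%}, unlabeled {unlabeled:.0%}")
```

Under these assumptions, even modest gaps in coverage push the share of accounts that are either mislabeled or not labeled at all well past 10 percent.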

Leaving aside people who don't identify as either strictly or solely male or female, or who reject those gender constructs, there are accounts run by teams and by individuals representing entities or publications, joke accounts, people who specifically crosstweet — using a name, avatar, description, and other factors that don't represent their identified gender — and bots, which may provide a sense of gender, but arguably possess none (yet) in a meaningful manner. Twitter may be mostly accurate, but I have to believe the error bar is larger than they maintain. The Pew Research Center's "Social Media Update 2013," released at the end of last year, finds statistically equal numbers of men and women (self-identified) in America using Twitter. The women are there.

My more than 15,000 followers are divided 81/19 male/female, and the numbers for each (which you can see when you hover over each bar) add up to nearly my exact current total. This isn't that odd, I suppose, since I dad tweet, tech tweet, and pun tweet, all of which are likely to find more male peers. Colleague Lisa Oberndorfer, an Austrian tech journalist who did a recent long stint in San Francisco, says she splits 74/26 male/female.

But the results seem stranger for many of my female friends and colleagues, who, even if not in tech, have Twitter reporting 75 to 80 percent of their followers are men. When they examine their list of followers and with whom they interact, they find it implausible.

For instance, my buddy Swoozy Clancey (a nom de Twitter) lists "feminist killjoy" and "kill the kyriarchy" in her bio, but somehow has 75 percent male followers. Another friend, @MaddieSayWhat, a PhD candidate in counselor education and supervision, splits 72/28. However, Maddie offers one plausible explanation: people are more likely to follow those of a gender to which they are attracted, even if they aren't specifically attracted to that person. She notes that following a gender to which one isn't attracted is "less reward centery." That would help explain the ratio for women tweeters, but not necessarily for men.

Some women and men report much more equal ratios. Sarah Werner, the digital media strategist at the Folger Shakespeare Library, says her ratio is 52/48 male/female; she checked Folger Research's account, and it has a mirror: 53/47 female/male. Where people checked, the male and female counts combined typically added up to nearly the total number of followers, as in my case, indicating extremely high confidence.

One other factor may be the absolute number of people that identified men and women follow. If men follow 30 percent more people than women on average, that could also account for a more general disparity.
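A back-of-the-envelope sketch shows how much skew such a gap could produce. It assumes equal numbers of men and women on the service (in line with the Pew finding) and that people choose whom to follow without regard to gender; neither assumption is confirmed by Twitter, and the function names are hypothetical.

```python
def male_follow_share(follow_ratio, men=1.0, women=1.0):
    """Expected male share of an average account's followers.

    follow_ratio: how many accounts the average man follows, relative
                  to the average woman (1.3 means 30 percent more).
    men, women:   relative population sizes (assumed equal by default).
    """
    male_follows = men * follow_ratio
    female_follows = women * 1.0
    return male_follows / (male_follows + female_follows)


def required_follow_ratio(male_share):
    """Follow ratio needed to reach a given male share, populations equal."""
    return male_share / (1 - male_share)


print(f"{male_follow_share(1.3):.1%}")   # share if men follow 30% more
print(required_follow_ratio(0.75))        # ratio needed for a 75/25 split
```

On these assumptions, a 30 percent gap yields roughly a 57/43 split, while a 75/25 split would require men to follow about three times as many accounts as women. The gap could contribute to the disparity but would not explain the largest skews by itself.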

One has to ask, after all this scrying of gender, whether it matters a bit. For advertisers who know their products skew to one gender or another, or produce better results with gender-tailored marketing for the same market, sure. For the rest of us, it's hard to say.

It can seem disconcerting when you think you're being listened to by an audience you imagine in one fashion, and it turns out to be another. But there's enough ambiguity in Twitter's numbers, and not enough information revealed, to take some percentage points of accuracy with a grain of salt.