Comprehensive, open tutorial on using data analysis in social science research

You are a Gmail user

For years, Benjamin Mako Hill has paid to host his own mail, as a measure to enhance his privacy and independence from big companies. But a bit of clever analysis of his stored mail reveals that despite this expense and effort, he is a Gmail user, because so many of his correspondents are Gmail users and store copies of his messages with Google. And thanks to an archaic US law, the Electronic Communications Privacy Act of 1986, any message left on Gmail for more than six months can be requested by police without a warrant, as it is considered "abandoned."

Mako has posted the script he used to calculate how much of his correspondence ends up in Google's hands.
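To give a rough idea of what such a calculation involves, here is a minimal sketch in Python. It is not Mako's script: the function name `google_share` and the `GOOGLE_DOMAINS` set are hypothetical, and it assumes you have already pulled the From headers out of your mail archive. It also only recognizes Google's consumer domains, so it would undercount correspondents whose own domains are hosted by Google.

```python
from email.utils import parseaddr

# Assumption: only these consumer domains are treated as Google-hosted.
# Custom domains that route their mail through Google are not detected.
GOOGLE_DOMAINS = {"gmail.com", "googlemail.com"}


def google_share(from_headers):
    """Return the fraction of parseable From headers sent from a Google domain."""
    domains = []
    for header in from_headers:
        # parseaddr splits "Name <user@host>" into (name, address).
        _, address = parseaddr(header)
        if "@" in address:
            domains.append(address.rsplit("@", 1)[1].lower())
    if not domains:
        return 0.0
    return sum(d in GOOGLE_DOMAINS for d in domains) / len(domains)


sample = [
    "Alice <alice@gmail.com>",
    "bob@example.org",
    "Carol <carol@GoogleMail.com>",
    "dave@example.net",
]
print(google_share(sample))
```

With the four sample headers above, two of the four senders are on Google domains, so the function returns 0.5.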

I host my own mail, too. I'm really looking forward to Mailpile, which should make this process a lot easier, and also make keeping all my mail encrypted simpler. Knowing that Google has a copy of my correspondence is a lot less worrisome if they can't read it (though it's still not an ideal situation).

What makes a project remixable?

In The remixing dilemma: The trade-off between generativity and originality [PDF], a paper just published in American Behavioral Scientist, Benjamin Mako Hill and Andrés Monroy-Hernández analyzed a dataset of projects from the Scratch website that had been made available for download and remixing. They were attempting to identify the formal attributes that made some projects more likely to attract remixers. As Mako describes in this summary, they found that the projects that were most remixed were neither overly complex (too intimidating) and finished, nor overly vague and undefined (too uninspiring). The Scratch dataset was a good one to study here, because it includes the number of times each project was viewed as well as the number of remixes it inspired, allowing the authors to calculate the probability that a project will inspire a remix while controlling for its overall popularity:

To test our theory that there is a trade-off between generativity and originality, we build a dataset that includes every Scratch remix and its antecedent. For each pair, we construct a measure of originality by comparing the remix to its antecedent and computing an “edit distance” (a concept we borrow from software engineering) to determine how much the projects differ.

We find strong evidence of a trade-off: (1) Projects of moderate complexity are remixed more lightly than more complicated projects. (2) Projects by more prominent creators tend to be remixed in less transformative ways. (3) Cumulative remixing tends to be associated with shallower and less transformative derivatives. That said, our support for (1) is qualified in that we do not find evidence of the increased originality for the simplest projects as our theory predicted.
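The "edit distance" mentioned in the quote can be illustrated with the classic Levenshtein distance. This is only a sketch under assumptions: the paper's exact measure over Scratch projects is not described here, and representing a project as a flat list of block names (as in the hypothetical `original` and `remix` lists below) is a simplification made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions
    needed to turn sequence a into sequence b."""
    # prev[j] holds the distance between the first i-1 items of a
    # and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # substitute x with y (or keep it)
            ))
        prev = curr
    return prev[-1]


# Hypothetical projects, each reduced to a list of block names.
original = ["when_flag_clicked", "move", "turn", "say"]
remix = ["when_flag_clicked", "move", "play_sound", "say", "wait"]
print(edit_distance(original, remix))
```

Here the remix swaps one block and appends another, so the distance is 2. A larger distance between a remix and its antecedent would correspond to a more transformative remix.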

Group whose Wikipedia entry was deleted for non-notability threatens lawsuit against Wikipedian who participated in the discussion

Benjamin Mako Hill writes, "Last year, I participated in a discussion on Wikipedia that led to the deletion of an article about the 'Institute for Cultural Diplomacy.' Because I edit Wikipedia using my real name, the ICD was able to track me down. Over the last month or so, they threatened me with legal action and have now gotten their lawyers involved. I've documented the whole sad saga on my blog. I think the issue raises some important concerns about Wikipedia in general."

Donfried has made it very clear that his organization really wants a Wikipedia article and that they believe they are being damaged without one. But the fact that he wants one doesn’t mean that Wikipedia’s policies mean he should have one. Anonymous editors in Berlin and in unknown locations have made it clear that they really want a Wikipedia article about the ICD that does not include criticism. Not only do Wikipedia’s policies and principles not guarantee them this, Wikipedia might be hurt as a project when this happens.

The ICD claims to want to foster open dialogue and criticism. I think they sound like a pretty nice group working toward issues I care about personally. I wish them success.

But there seems to be a disconnect between their goals and the actions of both their leader and proponents. Because I used my real name and was skeptical about the organization on discussion pages on Wikipedia, I was tracked down and threatened. Donfried insinuated that I was motivated to “sabotage” his organization and threatened legal action if I do not answer his questions.

Antifeatures: deliberate, expensive product features that no customer wants

Free software advocate Benjamin Mako Hill's lecture on "Antifeatures" for the Free Technology Academy is a fascinating look at the ubiquitous "antifeature" -- that is, a deliberately designed product feature that none of the product's users desire. Examples include cameras that block saving images as RAW files, phones that are designed to identify and drain third-party batteries, and, of course, printers that are designed to reject third-party ink.

Mako makes a compelling case that these sorts of features are endemic to proprietary technology, and that free and open technologies are the antidote to them.

LilyPad microcontroller's success in welcoming women to electronics

MIT's Leah Buechley and Benjamin Mako Hill recently published a paper called LilyPad in the Wild: How Hardwareʼs Long Tail is Supporting New Engineering and Design Communities, about the success of the LilyPad microcontroller in attracting women to electronics projects. LilyPad is derived from the Arduino open processor, but was "specifically designed to be more useful than other microcontroller platforms (like normal Arduino) in the context of crafting practices like textiles or painting." The Buechley/Hill paper shows that this was a successful strategy for engaging women makers and contemplates how to use the LilyPad approach to engage with women and girls in other science/technology/engineering/math (STEM) domains:

Our experience suggests a different approach, one we call Building New Clubhouses. Instead of trying to fit people into existing engineering cultures, it may be more constructive to try to spark and support new cultures, to build new clubhouses. Our experiences have led us to believe that the problem is not so much that communities are prejudiced or exclusive but that they're limited in breadth--both intellectually and culturally. Some of the most revealing research in diversity in STEM found that women and other minorities don't join STEM communities not because they are intimidated or unqualified but rather because they're simply uninterested in these disciplines.

One of our current research goals is thus to question traditional disciplinary boundaries and to expand disciplines to make room for more diverse interests and passions. To show, for example, that it is possible to build complex, innovative, technological artifacts that are colorful, soft, and beautiful.

Hackers' wedding vows based on Pi and Phi

My friend Mako got married recently; he's a hacker and so's his new wife, Mika, and they exchanged vows of mathematical significance: "the number of letters in each word in each vow matches consecutive digits in the decimal expansion of a famous mathematical constant." Mika chose Pi, Mako chose Phi. Here's the Pi vow:

Now, I give a total offertory to joyful union.

I'll honor - joyously, endlessly, loyally, devotedly - you.

In the marriage that unites us, paired Yang and Yin, Benjamin and me, forever soulmates, shall complement as partners steadily.

With a doubtless promise, I pledge integrity and stability sincerely. Our rounded rings, a completely noble treasure; it represents continual respect, love, perpetual link with trust, limitless.

My vow: absolutely lasting devotion.
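The rule behind the vow is easy to check mechanically. The sketch below counts the letters in each word and compares them with the digits of Pi. Two conventions are assumptions rather than anything stated in the vows: a ten-letter word stands for the digit 0, and apostrophes do not count as letters. The names `letter_counts` and `matches_digits` are made up for this example.

```python
import re

# The leading 3 of pi followed by its first 70 decimal digits.
PI_DIGITS = (
    "3"
    "1415926535" "8979323846" "2643383279" "5028841971"
    "6939937510" "5820974944" "5923078164"
)

VOW = """
Now, I give a total offertory to joyful union.
I'll honor - joyously, endlessly, loyally, devotedly - you.
In the marriage that unites us, paired Yang and Yin, Benjamin and me,
forever soulmates, shall complement as partners steadily.
With a doubtless promise, I pledge integrity and stability sincerely.
Our rounded rings, a completely noble treasure; it represents continual
respect, love, perpetual link with trust, limitless.
My vow: absolutely lasting devotion.
"""


def letter_counts(text):
    """Letters per word, modulo 10 (assumption: a 10-letter word encodes 0)."""
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    return [len(word.replace("'", "")) % 10 for word in words]


def matches_digits(text, digits):
    """True if every word's letter count equals the digit at the same position."""
    counts = letter_counts(text)
    if len(counts) > len(digits):
        return False  # not enough reference digits to check every word
    return all(str(count) == digit for count, digit in zip(counts, digits))


print(letter_counts("Now, I give a total offertory to joyful union."))
print(matches_digits(VOW, PI_DIGITS))
```

The first line of the vow yields 3, 1, 4, 1, 5, 9, 2, 6, 5, and under the conventions above all 68 words of the vow line up with the first 68 digits of Pi.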

