NYT: Internet activist accused of data theft

Internet activist Aaron Swartz, formerly of Reddit and Wired Digital, was indicted Tuesday on charges of data theft. The U.S. attorney in Boston claims that he "stole" millions of JSTOR documents while at M.I.T., crimes that could put him in prison for 35 years. Here's Nick Bilton in the New York Times:

In a press release, Ms. Ortiz's office said that Mr. Swartz broke into a restricted area of M.I.T. and entered a computer wiring closet. Mr. Swartz apparently then accessed the M.I.T. computer network and stole millions of documents from JSTOR.

In its own press release, Demand Progress, the political action group founded by Swartz, denies the prosecutor's claims outright: "As best as we can tell, he is being charged with allegedly downloading too many scholarly journal articles from the Web." It compares the alleged offense to "checking too many books out of the library."

JSTOR is an online archive of print journals, containing millions of articles.

The prosecutor's language here is unequivocal: Swartz "broke in" to a "restricted area" to gain access to a "wiring closet" that would enable a mass data theft. The criminal complaint [via Jason Levine and Anil Dash] suggests, however, that most of the theft was accomplished using scraper software to download articles en masse over the web, from a website he already had access to.

"Swartz used the Acer laptop to systematically access and rapidly download an extraordinary volume of articles from JSTOR. He used a software program to automate the downloading process so that a human being would not need to keep typing in the archive requests."

The trip to the wiring closet happened after JSTOR finally blocked that technique:

On January 4, 2011, Aaron Swartz was observed entering the restricted basement network wiring closet to replace an external hard drive attached to his computer. On January 6, 2011, Swartz returned to the wiring closet to remove his computer equipment. This time he attempted to evade identification at the entrance to the restricted area. As Swartz entered the wiring closet, he held his bicycle helmet like a mask to shield his face, looking through ventilation holes in the helmet. Swartz then removed his computer equipment from the closet, put it in his backpack, and left, again masking his face with the bicycle helmet before peering through a crack in the double doors and cautiously stepping out.

Needs a theme tune by Henry Mancini.

Note: the NYT originally reported that Swartz was a co-founder of Reddit, a claim repeated in an earlier version of this post's headline. I've updated the post to reflect the paper's correction: Swartz joined Reddit early, but not as a founder.