Data Mining 101: Finding Subversives with Amazon Wishlists

Mark Frauenfelder at 4:19 pm Wed, Jan 4, 2006

— POLICIES —

Except where indicated, Boing Boing is licensed under a Creative Commons License permitting non-commercial sharing with attribution

 

Frequent Make contributor Tom Owad just published a mind-blowing how-to on his website explaining how to mine Amazon's wish list database to uncover "subversives." An excerpt from his write-up:

Using a pair of 5-year-old computers, two home DSL connections, 42 hours of computer time, and 5 man hours, I now had documents describing the reading preferences of 260,000 U.S. citizens.

I downloaded all the files to an external 120 GB Firewire drive in UFS format. The raw data occupied little more than 5 GB. I initially wanted to move all the files into a single directory to facilitate searching, but as the directory contents exceeded 100,000 items, the speed became glacially slow, so I kept the data divided into chunks of 25,000 wishlists.
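For anyone curious what that chunking step might look like, here is a minimal Python sketch. It is not Owad's code. The function name `shard_files`, the `chunk_NNNN` directory naming, and the idea of one saved file per wishlist are all assumptions made for illustration; only the chunk size of 25,000 comes from the text above.

```python
import shutil
from pathlib import Path


def shard_files(source_dir: Path, dest_root: Path, chunk_size: int = 25_000) -> int:
    """Move every file in source_dir into numbered subdirectories of dest_root.

    Each subdirectory receives at most chunk_size files, so no single
    directory grows large enough to make listing and searching slow.
    Returns the number of chunk directories that were used.
    """
    # Take a sorted snapshot first so the result is the same on every run.
    files = sorted(p for p in source_dir.iterdir() if p.is_file())

    for index, path in enumerate(files):
        # Integer division maps files 0..chunk_size-1 to chunk 0, and so on.
        chunk_dir = dest_root / f"chunk_{index // chunk_size:04d}"  # hypothetical naming
        chunk_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(chunk_dir / path.name))

    # Ceiling division: a partially filled last chunk still counts.
    return (len(files) + chunk_size - 1) // chunk_size
```

Calling `shard_files(Path("raw"), Path("chunks"))` would leave directories such as `chunks/chunk_0000`, each holding up to 25,000 files.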

Next comes the fun part – what books are most dangerous? So many to choose from. Here's a sample of the list I made. Feel free to make up your own list if you decide to try some data mining. Send it to the FBI. I'm sure they'll appreciate your help in fighting terrorism.
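The excerpt doesn't show how the search itself was run, so what follows is only a rough sketch of one way to scan the chunked files for a list of titles. The function name `find_matches`, the plain substring matching, and the assumption that each wishlist is a saved text or HTML file are all hypothetical.

```python
from pathlib import Path


def find_matches(data_root: Path, watch_terms: list[str]) -> dict[str, list[str]]:
    """Return {relative file path: [terms found]} for files under data_root.

    Matching is a case-insensitive substring test, which is crude: it will
    miss alternate editions and can hit on unrelated text that happens to
    contain a term.
    """
    lowered = [term.lower() for term in watch_terms]
    matches: dict[str, list[str]] = {}

    # rglob walks every chunk directory beneath data_root.
    for path in sorted(data_root.rglob("*")):
        if not path.is_file():
            continue
        # errors="ignore" skips bytes that are not valid UTF-8 instead of failing.
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        hits = [term for term, low in zip(watch_terms, lowered) if low in text]
        if hits:
            matches[path.relative_to(data_root).as_posix()] = hits

    return matches
```

A call like `find_matches(Path("chunks"), ["Example Title A", "Example Title B"])` returns only the files that mention at least one of the placeholder titles.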

Link

Reader comment: Anonymous says: "[This] method for grabbing the wishlists is overly complicated. Amazon's web services API allows programmatic access to wishlist information, making it even easier for a savvy programmer to quickly compile a list of customers interested in certain books."

Mark Frauenfelder is the founder of Boing Boing and the editor-in-chief of MAKE and Cool Tools. Twitter: @frauenfelder. Come and hear Mark speak at the ALA conference in Chicago on July 1.
