Promise and peril of data-scraping


4 Responses to “Promise and peril of data-scraping”

  1. Fnarf says:

    Ah, the joys of “information wants to be free”. Just wait until some loser murders his wife in her domestic violence shelter whose location used to be shielded from public view. Google recently had ours on display in their directory, with a friggin’ photo of the front door. Getting it removed was extremely difficult, because mortals can’t contact Google; they contact you. Like the gods.

  2. Jeff says:

    This was a good article, and I wonder where the privacy issue will take us. Charles Stross has made an issue out of all the CCTV cameras in London, watching your every move. Which would be fine with me if that happens in all cities. I don’t think I’m going to have more privacy in the future, but less.


  3. Adam Weiss says:

    Once upon a time, I scraped an external archive of craigslist and made some plots of message frequency over time by category. The external archive contained data from 1998 through 2001, making for some very interesting pictures of a dot.boom and bust. Slashdot got wind of it and tens of thousands of folks came to look at the graphs.

    Later I wanted to legitimize it and plot data in realtime. (Given the flip-flopping of the dotconomy over the past five years, it would have been very interesting.)

    In a foolish choice to be a “good Internet citizen”, I e-mailed the craigslist folks to ask for permission. I figured “Hey. Any community-minded organization wouldn’t mind, especially an organization like craigslist.” That was true of Craig Newmark, who tried to help. However, their douchebag of a CEO wouldn’t budge, claiming that there was absolutely no way I could possibly collect the data without crashing their servers. (Even when I suggested well-spaced RSS pulls, he rudely said no.)

    It really made me realize that no matter how much these guys like to present themselves as “community-minded,” at the end of the day, when it comes to their data, they’re obnoxious closed businessmen all the same.

  4. Adam Weiss says:

    CORRECTION: I just looked at the old e-mail thread where this transpired. It wasn’t the CEO who put the kibosh on this, it was one of their techies. Nevertheless, the CEO was on the thread and never intervened, which is almost but not quite as lame as I made him out to be in my prior comment.
