Archiving every Podcast

Jason Scott is the archivist whose textfiles.com contains copies of every text file that circulated in the massive world of BBSes in the pre-Internet days. He's launched a new project: archiving every single podcast ever made. It's only 75GB so far, but growing fast. Jason's explanation of why he's decided to do this is inspiring, a call to arms to preserve digital culture.

Obviously, I need some space to store all these podcasts, but space, these days, is very cheap. I watch sites that provide specials for hardware, and can purchase a 250 gigabyte hard drive for $100. It's a drive type that is prone to failure, so I buy two. At home, I run these drives on USB2 enclosures, on two separate machines, and I use a program called rsync to keep them synchronized. I download podcasts using a program called doppler, which has several advantages to its approach that are useful for archiving. I have the podcasts on a network drive, so I am not beholden to a specific machine to download the podcasts. I found very quickly that Doppler Radio didn't check to see if you had pointed it to multiple copies of the same feeds (it assumes you're using such a small amount of feeds, that you would always notice the doubles yourself), so I wrote a perl script that yanked out doubles. This has held up for the time being, and while I don't have firm numbers on how much disk space per day this process is taking, I'm not too worried about it…
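The duplicate-feed cleanup described above is easy to picture in code. What follows is only a rough sketch, in Python rather than the Perl that Jason mentions, and it is not his script: the function names, the sample URLs, and the rule for deciding that two feed URLs are "the same" (ignore case in the scheme and host, ignore a trailing slash and any fragment) are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_feed_url(url: str) -> str:
    """Reduce a feed URL to a comparison key (hypothetical rule set)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    # Treat "/show/" and "/show" as the same feed; keep a bare "/" for the root.
    path = parts.path.rstrip("/") or "/"
    # Drop the fragment, keep the query string untouched.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


def unique_feeds(urls):
    """Return the feed URLs with doubles removed, keeping first-seen order."""
    seen = set()
    kept = []
    for url in urls:
        if not url.strip():
            continue  # skip blank lines in the subscription list
        key = normalize_feed_url(url)
        if key in seen:
            continue  # a double of a feed we already have
        seen.add(key)
        kept.append(url.strip())
    return kept


if __name__ == "__main__":
    # Made-up subscription list, not real feeds.
    feeds = [
        "http://Example.com/feed.xml",
        "http://example.com/feed.xml",
        "http://example.com/show/",
        "http://example.com/show",
        "",
        "http://example.org/other.rss",
    ]
    for feed in unique_feeds(feeds):
        print(feed)
```

Keeping the first copy of each feed and preserving the original order means the cleaned list can simply replace the old subscription list. Whether two URLs that differ only in case or a trailing slash really point at the same feed is a judgment call, so a stricter archive might compare the raw strings instead.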

Podcasting certainly has its roots in zine culture, home-brew tapes, BBSes, carbon-copy SF fanzines, and telegraph. If that's too high-minded and artsy-historian, then I could point to the direct event of the fad of "Push Technology" that infected a number of companies in 1998 through to 1999. Microsoft and Netscape both claimed that Push technology would change everything, and Pointcast tried to build a business on it. Really, it was all a fine idea, but the order of the day was to claim that not only was a good idea good, but it would actually turn dog poop into solid gold, so the actuality had issues with the (stock-driven) promises.

Link

(via Waxy)