A deep dive into the race to preserve our digital heritage

Science Friday's beautiful "File Not Found" series looks at the thorny questions of digital preservation: finding surviving copies of data, preserving the media it is recorded upon, finding working equipment to read that media, finding working software to decode the information once it's read, clearing the rights to archive it, and maintaining safe, long-term archives -- all while being mindful of privacy and other equities.

I'm quoted in the piece, but regret that I wasn't clearer. I think that modern data is actually a lot simpler to preserve than older data, because of the growth in both cloud and local, online storage. The data I had to store on floppies (before I had a hard drive, and then after I got one but before I could afford a drive that was capacious enough to maintain all my data) is vulnerable because it's on media that is slowly decaying and whose reading equipment is getting harder and harder to source.

But once the floppies and cards and tapes and cartridges are read into the primary storage of computers in constant use, the data gets a lot more robust. Backing up that data gets easier and easier (I maintain two encrypted hard drives with backups, only one of which is onsite, and which are rotated; as well as an encrypted cloud backup of key data), and running programs that can interpret the data has effectively ceased to be a problem because I can use virtual machines running obsolete operating systems and the original programs to see, copy and manipulate the data.
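
Rotated backups are only useful if the copies still match the originals. One common way to check is to compare checksums of every file in the working copy against the same files on a backup drive. What follows is a minimal sketch of that idea, not the setup described above: the function names build_manifest and compare_manifests are hypothetical, and it assumes the backup is mounted and decrypted as an ordinary directory tree.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large files never sit fully in memory.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def compare_manifests(primary, backup):
    """Return (missing, changed): files absent from the backup, and files
    present in both places whose contents differ."""
    missing = sorted(set(primary) - set(backup))
    changed = sorted(
        name
        for name in primary
        if name in backup and primary[name] != backup[name]
    )
    return missing, changed
```

Calling compare_manifests(build_manifest(working_dir), build_manifest(backup_dir)) with two directory paths of your own returns two lists. If both are empty, every file in the working copy has an identical counterpart in the backup. Files that exist only in the backup are deliberately ignored, since old material kept only in a backup is not a failure.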

As fast as my personal data is growing, the cost of the main drive I use in my daily-use computer is dropping faster. Every year, I have much more long-term data, and every year I have much more headroom on the drive in my laptop.

In a future world of obsolescence, digital objects of the past could be lost due to the disappearance of the original programs and machines that read them—the data gobbled up by the informational black hole. For example, files once readable on floppy disks become increasingly harder to retrieve as readers become more of a vintage item.

“We’re like nautiluses,” says Cory Doctorow, science fiction writer and author of books like Walkaway and Little Brother. “We go from one device to the next and the next one because storage keeps getting so cheap, [and] has twice as much storage [as] the last ones we had.”

The result: a trail of vulnerable data, which can become unsalvageable if not maintained properly. Not all data may need to be preserved, but the potential loss and disappearance of certain information could pose a risk to maintaining a coherent picture of the digital age. And with our current fixation on upgrading hardware every few years, the problem is getting worse.

In order to prevent us from spiraling further into the informational black hole, researchers are on the hunt for ways to immortalize history—a system to eternalize data forever.

1. Ghosts In The Reels [Lauren J Young/Science Friday]

2. The Librarians Saving The Internet [Lauren J Young/Science Friday]

3. Data Reawakening [Lauren J Young/Science Friday]