Turning paper into ASCII

Gary Wolf has a wonderful feature in this month's Wired about the parallel efforts to put texts, indices and images of books on the net (and to render them in cheap wood-pulp substrate), from the Internet Bookmobile to Amazon's Search Inside the Book system:
Kahle is happy to sidestep the problem of digitizing commercially successful books. He has no wish to antagonize the publishing industry. What he hates is that the Million Book Project cannot legally digitize countless books that aren't generating money for anybody. US libraries hold about 30 million unique volumes. No one knows how many of those books continue to be protected by copyright or are available from commercial publishers. Still, Kahle says, "they can't be digitized because the copyrights can't be cleared, and the copyrights can't be cleared because it's too much work to identify the copyright holders. Some people call them abandonware. I call them orphans."

"Amazon is taking a cut at the commercially available titles," continues Kahle. "We are going for the public domain titles. But who is taking care of the orphans? Nobody."