Guestblogger Paul Spinrad is a freelance writer/editor, and is Projects Editor for MAKE magazine. He is the author of The VJ Book and The Re/Search Guide to Bodily Fluids, and was an early contributor to bOING bOING when it was an online zine. He lives in San Francisco.
It's fantastic that so much written knowledge is becoming generally accessible and cross-linked these days, but this is just an intermediate stage-- a universal library on the way to becoming a universal brain. The missing piece is encoding the underlying meaning of the stored text, the deep-structure logic behind it. It's one of the oldest challenges in computer science, and there has been plenty of progress, with entire companies dedicated to the problem. Powerset, for example, has software that has parsed all of Wikipedia and can answer questions from it.
The thing is, you still need a person to get it reliably right, because people understand how the world works. Luckily, we already have people whose job comes very close to this-- they're called fact-checkers or researchers, and they work for every reputable publication.
I don't think the fact-checking process is very well understood by the public-- it's hidden from view and uncredited (which is lame), and I didn't understand it myself until I began working with magazines. Basically, someone combs through a piece of text and makes sure every fact is verified. They look things up in established references, they call people on the phone, they call their friends who have experience in some area, or whatever else it takes. If they're doing it on paper, they start with a printout of the article, and when they're done, every word, every clause, and every spelling of every proper name has a pencil mark through it.
I have wondered for years, as magazines, newspapers, and other news organizations have been hemorrhaging money and employees, why someone hasn't gone into the contract fact-checking business. Like, it could be an extension of Snopes.com. There's a huge redundancy in every publication maintaining its own research desk. Publications could lay off their fact-checkers and outsource the job to a new, independent company-- the one the best of those fact-checkers all go to work for. Meanwhile, that company could also be hired by anyone else. Then, when the public sees the "Fact-Checked by MiniTrue (SM)" seal on someone's independent blog, they'll know the information there has the same credibility as the big boys'.
Now, what if these fact-checkers didn't just vet and correct the text? While they dig into the logic and accuracy of everything, as usual, they could also use some simple application to diagram the sentences and disambiguate the semantics into a machine-friendly representation. Just a little extra clicking, and they could bind all the pronouns to their antecedents, and select from a dropdown box to specify whether an instance of the string "Prince" refers to the musician Prince or to Erik Prince-- the president of XE, the company formerly known as Blackwater-- within an article that for whatever reason mentions both of them.
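To make this concrete, a fact-checker's disambiguation clicks could be captured as simple span annotations over the text. Here's a minimal sketch in Python of what that representation might look like-- the entity IDs are hypothetical stand-ins for whatever knowledge-base identifiers such a tool would actually use:

```python
from dataclasses import dataclass

@dataclass
class EntityAnnotation:
    start: int       # character offset where the mention begins
    end: int         # character offset where the mention ends
    surface: str     # the text exactly as it appears
    entity_id: str   # hypothetical knowledge-base identifier chosen by the checker

def annotate(text: str, start: int, end: int, entity_id: str) -> EntityAnnotation:
    """Record one disambiguation choice as a span annotation."""
    return EntityAnnotation(start, end, text[start:end], entity_id)

# Two mentions of the string "Prince" bound to different entities.
text = "Prince wrote the song, while Prince ran the company."
annotations = [
    annotate(text, 0, 6, "musician:prince"),     # hypothetical ID for the musician
    annotate(text, 29, 35, "person:erik-prince") # hypothetical ID for Erik Prince
]
```

The point of a structure like this is that the same surface string carries different machine-readable identities, so any downstream system can chain through the annotations without re-guessing who "Prince" is.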
Then you would really have something. The text wouldn't just be fact-checked; its underlying meaning could be added into a shared pool of human knowledge, chained through, verified or denied, and used in other ways by any technology that may now exist or may exist in the future.
Many of the big ideas that computer visionary Douglas Engelbart came up with in the 1960s have come true, but a couple of them haven't yet. One of these is his notion of the "Certified Public Logician." Engelbart predicted that a new class of knowledge worker would act as a front end to machine-enabled collective intelligence. Part logician, part notary, these "Certified Public Logicians" would review texts for logical consistency, tag them up with the appropriate envelope information, and enter them into the machine. It's a great idea, and I think we could promote all of our fact-checkers into Certified Public Logicians pretty easily.