Steven Levy, one of the great technology journalists, got an unprecedented inside look at Google's search algorithm and wrote up his experience in a long, fascinating Wired feature. I had several a-ha moments reading the piece, which helped me understand what's going on under the hood when I type queries into my little search box.
Google is famously creative at encouraging these breakthroughs; every year, it holds an internal demo fair called CSI -- Crazy Search Ideas -- in an attempt to spark offbeat but productive approaches. But for the most part, the improvement process is a relentless slog, grinding through bad results to determine what isn't working. One unsuccessful search became a legend: Sometime in 2001, Singhal learned of poor results when people typed the name "audrey fino" into the search box. Google kept returning Italian sites praising Audrey Hepburn. (Fino means fine in Italian.) "We realized that this is actually a person's name," Singhal says. "But we didn't have the smarts in the system."
The Audrey Fino failure led Singhal on a multiyear quest to improve the way the system deals with names -- which account for 8 percent of all searches. To crack it, he had to master the black art of "bi-gram breakage" -- that is, separating multiple words into discrete units. For instance, "new york" represents two words that go together (a bi-gram). But so would the three words in "new york times," which clearly indicate a different kind of search. And everything changes when the query is "new york times square." Humans can make these distinctions instantly, but Google does not have a Brazil-like back room with hundreds of thousands of cubicle jockeys. It relies on algorithms.
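To make the "new york times square" puzzle concrete, here is a toy sketch of how a query might be split into phrases. This is not Google's method, which the article doesn't describe in detail. The phrase table, its scores, and the `segment` function are all invented for illustration: the code simply picks the split whose known phrases add up to the highest score.

```python
# Hypothetical phrase scores, invented for illustration. A real system
# would presumably learn something like this from query and document data.
PHRASE_SCORES = {
    ("new", "york"): 2.0,
    ("new", "york", "times"): 3.5,
    ("times", "square"): 3.0,
}
MAX_PHRASE_LEN = 3  # longest phrase in the table above


def segment(query):
    """Split a query into the highest-scoring sequence of phrases."""
    words = query.lower().split()
    n = len(words)
    # best[i] holds (total score, segments) for the best split of words[:i].
    best = [(0.0, [])] + [None] * n
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_PHRASE_LEN), end):
            chunk = tuple(words[start:end])
            if len(chunk) == 1:
                score = 0.0  # a lone word is always allowed, but earns nothing
            elif chunk in PHRASE_SCORES:
                score = PHRASE_SCORES[chunk]
            else:
                continue  # unknown multi-word chunk: not a candidate phrase
            prev_score, prev_segments = best[start]
            candidate = (prev_score + score, prev_segments + [" ".join(chunk)])
            if best[end] is None or candidate[0] > best[end][0]:
                best[end] = candidate
    return best[n][1]


for q in ("new york", "new york times", "new york times square", "audrey fino"):
    print(q, "->", segment(q))
```

With these made-up scores, "new york times" stays together as one phrase, but adding "square" flips the split to "new york" plus "times square", because the two shorter phrases together outscore the newspaper's name. "audrey fino" falls apart into two unrelated words, since the toy table has no idea it's a person's name, which is roughly the failure Singhal describes.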
(via Beyond the Beyond)
- Newsweek's Stephen Levy: Capitol Hill P2P Prohibition craziness ...
- Suing Google over fixing its algorithm - Boing Boing
- Wiki-inspired "transparent" search-engine - Boing Boing
- Boing Boing: Goofy algorithm generates web page about "Prostitute ...
- Dianetics gaming Google - Boing Boing
- Do you remember your first Google? - Boing Boing
- Newsmap: visualizing the media tree through Google News - Boing Boing