Google search results are editorial, not (merely) mathematical

My latest Guardian column is "Google admits that Plato's cave doesn't exist," a discussion of how Google has changed the way it talks about its search results, shifting from the stance that rankings are a form of pure math to the stance that they are a form of editorial judgment.

Google has, to date, always refused to frame itself in those terms. The PageRank algorithm isn't like an editor arguing aesthetics around a boardroom table as the issue is put to bed. The PageRank algorithm is a window on the wall of Plato's cave, whence the objective, empirical world of Relevance may be seen and retrieved.

That argument is a convenient one when the most contentious challenges to your rankings come from people who want a higher ranking. "We have done the maths, and your page is empirically less relevant than the pages above it. Your quarrel is with the cold, hard reality of numbers, not with our judgement."

The problem with that argument is that maths is inherently more amenable to regulation than speech. If the numbers say that item X must be ranked over item Y, a regulator may decide that a social problem can be solved by "hard-coding" page Y to have a higher ranking than X, regardless of its relevance. This isn't censorship – it's more like progressive taxation.
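The "hard-coding" described here can be pictured as a thin override layer sitting on top of the algorithmic scores – a minimal, hypothetical sketch (the page names and scores are invented):

```python
def rank(pages, scores, pinned=None):
    """Order pages by algorithmic score, then apply any mandated
    'hard-coded' overrides that pin pages above the rest."""
    pinned = pinned or []
    ordered = sorted(pages, key=lambda p: scores[p], reverse=True)
    # Pinned pages jump the queue regardless of their scores.
    rest = [p for p in ordered if p not in pinned]
    return pinned + rest

# Page Y scores far lower than X, but a regulator pins it first anyway.
print(rank(["X", "Y"], {"X": 0.9, "Y": 0.2}, pinned=["Y"]))  # ['Y', 'X']
```

The point is that once ranking is framed as arithmetic, an override like `pinned` is a trivial, legible intervention – which is exactly what makes the "pure maths" framing regulator-friendly.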

Google admits that Plato's cave doesn't exist


    1. Because pure search neutrality discounts the fact that sites rig the game. Working in SEO, and seeing how completely useless page rankings are at surfacing good sites, gives you a greater insight into how it works.

      1. Interesting. So we’ll be perpetually stuck in a game of search providers needing to “editorialize” search to correct for the evils of people gaming the algorithm?

        I’ll try DuckDuckGo, Andrew.

        Thanks to you both.

        1. Not eternally, just until some search engine manages to properly quantify what a human finds to be quality content as opposed to keyword-baiting nonsense.

          1. SEO and good content go hand in hand, and eventually the two will meet somewhere in the middle. At that point the Google SEO guide will be renamed the Google English Language Style Guide, artificial journalist intelligence will be believable (at least for the written word), and human journalists will mostly be out of work. I think that’s within reach in our lifetime. Google can already throw together a better-edited news page than most newspapers; it’s not far off before computers are writing well-written, easily understandable, fact-checked stories for the more mundane things. There are only maybe one or two news stories a week that break which require an actual human to begin collecting the facts; the rest are just updates and analysis of new evidence in many cases.

    2. What does “search neutrality” even mean? I would honestly love to know what people think a “neutral” search engine looks like, since “relevance” is not an objective concept.

      No matter how you answer that, consider: If you don’t like the search results you’re seeing, you have the option to use a different search engine. There are hundreds of them out there, and getting to one is as easy as typing its name. Because of this, we can rely on the free market for search engines to prevent Google from altering results too much, because people will just go elsewhere, since it’s so easy to do so.

      Net neutrality is a completely different beast because most people (in the U.S. at least) only have the option to buy broadband access from one or perhaps a few providers. If all the providers in your area block or degrade Netflix, for example, you’re out of luck unless you’re willing to move house. Due to geographical monopolies, we can’t rely on the free market to take care of this issue, and this, in part, is why people feel the need for net neutrality regulation.

  1. I had always assumed that Google rankings are judged in an editorial way. This assumption was based on knowing someone who worked for Google from home, in the early days of the company’s service, doing just that. Her task, as she explained it, was to view a list of search terms and the pages that resulted from each search, and rank the relevance and accuracy/usefulness of the results against the search terms. Apparently there were many others like her, each with a broad category of search terms to review. From what I could tell she enjoyed this work, even though her search category was mostly pornographic words and phrases.

    1. I wonder if they used the results from her, and people like her, to compare the effectiveness of their computer-generated results with what people would do, in an effort to provide measurements, fine tune, and improve the computer algorithms.  In other words, it’s possible, or even likely, that they’d employ humans for ranking popular search terms and still have it be driven by math.

  2. As an ex-SEO drone, this is nothing but good news. Hopefully, the further they implement the human judgement for quality of info, the less we will see in high rankings for junk like Yahoo Answers, AllExperts, and other babbling question/answer sites, alongside some of the savvier content farms who made it through the Panda culling. 

  3. I knew Google’s search results were editorial when I googled “Santorum” and the top results were about the candidate and not the frothy mix.

    1. There are people named after anal froth? Shouldn’t that have been caught at Ellis Island?

  4. Well, it can be both a consistent algorithm and an aesthetic choice.  Google chooses one algorithm over another because they like what it produces, but that does not necessarily mean they are pushing  specific sites.  Cory is too quick to suggest that when he speaks of hard-coding a rank for site Y.

    Having it both ways can bat down complaints from both sides.  You can tell the people who want a better rank just because they think they are awesome that it’s not your fault, it’s the algorithm’s decision (and, implicitly, their own fault for not being linked enough or whatever.)  

    And you can tell the people who want to assert what a “fair” and “objective” search algorithm would look like (still according to their own interests, mind you) that it’s Google’s prerogative to choose their own algorithm and you can’t tell them what to do.

    1. I don’t think it can be both – it’s the P vs. NP problem, essentially.

      I think Google are being both obnoxious and self-defeating (over the long term) by obfuscating the functionality of their searches. It’s one thing to count points against specific domains, but entirely another to spice up my results with words and phrases that I did not search for and do not want in my results.

  5. Many years ago I worked for a search engine company. This was ages ago, and back then the big trick sites would use to goose up rankings was to copy an entire dictionary onto the bottom of a page and make the text the same color as the background. Thus, a search for any term would, theoretically, turn up that page. The search engine company figured out the hack and began blocking any page that used this strategy from all search results.

    This type of cat and mouse game continues between site owners and the search engines. As soon as someone figures out that, say, bolding picture titles moves their site up in the results, the search engines stop using that criterion in their algorithms, because it is no longer a legitimate measure of the site’s relevance. The search ranking algorithms have never been neutral and never could be, because keeping the algorithm static makes it prone to manipulation.
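The dictionary-in-the-background trick described above can be caught with a crude check like this – a simplified, hypothetical sketch that only inspects inline styles, nothing like a real crawler’s renderer:

```python
import re

def has_hidden_text(html, page_bg="#ffffff"):
    """Flag the old keyword-stuffing trick: text styled in the same
    colour as the page background, visible only to the crawler."""
    for m in re.finditer(r'style="[^"]*color:\s*([^;"]+)', html):
        if m.group(1).strip().lower() == page_bg.lower():
            return True
    return False

spam = '<p style="color: #ffffff">aardvark abacus abandon abase ...</p>'
print(has_hidden_text(spam))  # True
```

Real engines soon moved to rendering pages and comparing computed styles, which is exactly the cat-and-mouse dynamic the comment describes.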

  6. I don’t see an appreciable difference between the two approaches Cory outlines.

    In both cases, there is an implicit model being constructed by someone.  In one case, that model is a computer algorithm driven by data.  In the other case, it is the sense of the Editor as to how the ‘magazine’ should be constructed for the reader.

    In neither case will Google be free from the arbitrary influence of external parties (i.e. governments). And, in fact, they already aren’t – in China and other totalitarian states that have their own view of ‘reality’. The RIAA is going to push Google in that direction regardless of how they arrived at their search results.

  7. People want a neutral search engine? First they’ll have to actually define what a neutral search engine would be in real world terms – and how it would be possible to implement.

    I doubt it’s actually possible.

    Everything is biased in some way, according to someone’s perspective, and neutrality is, in large part, a question of perspective.

  8. I’ll give another example from that old job to show how search engine rankings are SO easy to manipulate. 

    One cool feature of the search engine I worked with was that it had a thesaurus in it, which could be used to shape search results. So, say you are searching for “television”. The thesaurus would tell you “tv” was a strong synonym (+10) and “home electronics” was a weaker synonym (+1). After results exactly matching your search term of television you would see results that included “tv”, and then way down the list would be results with “home electronics”. You could buy custom thesauri, which was useful for, say, a law firm or a hospital, which might want to search over legal or medical terms.

    Now, here’s the tricky bit – you could build a custom thesaurus to manipulate search results. We had a customer who sold wine over the internet and used our search engine to add the search function to their website. Using the custom thesaurus feature, they would force the products with the highest profit margin to the top of the ranking. So, say you search for “Cabernet Sauvignon”: the more profitable Sutter Home might be given a rank of +10, while the less profitable Turning Leaf is given a +1. Your “neutral” search results pop the Sutter Home up to the top of the list.

    What appears to be “just the maths” can actually be tweaked using many features of the search engine to pop a variety of results up to the top or downgrade others. Sometimes this is done in an attempt to replicate how a human would rank the results, which is what Google claims to do; other times it can be done to reward and punish certain kinds of behavior. But all of the factors that are used to rank the sites are determined by humans.
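The thesaurus mechanism described above fits in a few lines – a hypothetical sketch with invented weights, matching the +10/+1 scheme in the comment:

```python
# Hypothetical thesaurus: query term -> {synonym: weight added on a match}.
THESAURUS = {"television": {"tv": 10, "home electronics": 1}}

def score(query, doc, thesaurus=THESAURUS):
    """Exact matches outrank strong synonyms, which outrank weak ones.
    A custom thesaurus entry is all it takes to tilt the ranking."""
    text = doc.lower()
    s = 100 if query in text else 0  # exact term match dominates
    for synonym, weight in thesaurus.get(query, {}).items():
        if synonym in text:
            s += weight
    return s

docs = ["Sony television sale", "Cheap tv stands", "Home electronics outlet"]
print(sorted(docs, key=lambda d: score("television", d), reverse=True))
```

Swap in a “custom” entry such as `{"cabernet sauvignon": {"sutter home": 10, "turning leaf": 1}}` and the same machinery quietly promotes the high-margin wine, exactly as the comment describes.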

  9. Google wants to avoid being dragged into contributory copyright infringement the way Napster and Limewire were.  Those services didn’t store the actual infringing material, but they stored references to infringing material.  Napster at least was found to contribute to copyright infringement because of that.  

    At the same time, they’re trying to avoid regulation of the results by casting them as speech. Google’s trying to tread a line between saying “it’s all math, we don’t really control it, so we’re not contributing” and “we disagree with you on the importance of a particular site; it’s our opinion that the other site is better than yours.”

  10. I don’t mind SEOers doing what they do, or search results more and more being tailored to my search history – I can sign out and let the pages fall where they may. What really worries me – as the levers become better and better – is Google taking the next step and appointing themselves Editors of internet search. Of course, as long as everyone is able to upload, and there’s no throttling, over time the algorithm will reflect reality – no matter what Google, Bing or whoever says/does.

    On the other hand, I fear the big content producers are going to dominate as content gets richer and richer, because great content is, for the most part, going to be big content – and the purveyors of the big, great content are going to dominate the web, and the ISPs are going to have to bow to their demands – very soon, I fear.

  11. Big content already dominates the web. How many times have you searched for a product and the first listings that come up are on eBay or Amazon?

  12. This is not a copyright issue — it is a predatory practice or antitrust issue. In America, the statute would be the Robinson-Patman Act, which bars all types of explicit or implicit price discrimination. Since the DOJ has not been that active in taking on huge Robinson-Patman Act cases in the past 30 years, the Europeans have developed much stronger laws. Think Microsoft v. the EU rather than RIAA v. Pirate Bay.

  13. But isn’t a big part of Google’s issue that they wanted to APPEAR impartial, while actually returning preferential search results?  A big part of the value-add of Google search was supposed to be that it wasn’t for sale.  Editorial placement is always for sale.  Are people just going to conveniently forget this?  Or is Google walking a knife edge where they stand to lose their competitive edge in their attempt to avoid censorship? Google didn’t really say Plato’s cave doesn’t exist, they just said “we don’t really live there, and haven’t for quite some time.”

  14. Another option Google is quietly investigating: pay to rank. This won’t likely get people fired up because it relates solely to businesses and products – but it is related, and it seems ominous, and it will certainly hurt the small business I work for. Google has recently announced that their “Shopping” results will, starting in the fall, be ranked according to relevancy and bid, i.e. you must pay to be listed and the highest bid jumps to the top. They are currently ranked similarly to regular organic results: relevancy, data quality, store (site) reviews/quality. Currently, “Shopping” data is a combination of Google-scraped content and submitted feeds (submitting a feed is free). Since the Shopping results are integrated into regular organic search results, this essentially means you will be able to buy organic results. Apparently, even though our company has invested substantial resources into developing our Google products feed, we will no longer be able to submit the feed without paying – which in our case also means that, through a bait and switch, Google has stolen our good-faith development efforts. Ominous, and it seems to break how search should work.

  15. Slightly off-topic, but there wouldn’t happen to be anybody who knows if there’s a permanent opt-out from Guardian’s facebook app? I noticed it catches URLalts (like too these days.

    EDIT: Figured it out; had to do with cookie settings, not really the app that’s to blame. And, I should add, an enjoyable read as always.

  16. Google have always used human editors (they call them reviewers) as well as algorithmic factors to determine what they deem “relevance”.

    They wade through thousands of search terms and weed out the sites they consider spam or ‘unworthy’. The big G gives clear instructions (search for “leaked google review document”) to its ‘editors’ as to who gets to stay and who gets the boot.

    It’s no news that G constantly manipulates the results ‘by hand’ or ‘hard coding’ as Cory would prefer to describe the process. 

    The RIAA case is irrelevant and common sense will eventually prevail. Flat earth society lawyers chasing crumbs, as they did with the phonograph, the wireless and home taping.

    The real issue is whether Google will keep up its motto, “Don’t be evil”. A search engine can never be neutral, but gradually the organic search results are being crowded out by sponsored ads and brands. Can G be spared from megalomania as its monopoly grows?

    One potential saving grace is that the recent Panda and Penguin Google algo updates of 2011/12 place a massively higher weighting on a site’s engagement with the social web (Facebook, Twitter and, surprisingly, Google+).

    Social proof is one of the key metrics in the real world for convincing someone you’re a big shot. It seems Google are just trying to find a way of measuring this with lots of PhDs and a very big computer.
