
Search algorithms are editorial decisions

Cory Doctorow at 6:32 am Tue, Jun 2, 2009


 

— POLICIES —

Except where indicated, Boing Boing is licensed under a Creative Commons License permitting non-commercial sharing with attribution

 

In my latest Guardian column, "Search is too important to leave to one company - even Google," I make the case that Google's algorithms are editorial decisions, and that so much editorial power is better vested in big, transparent, public entities than a few giant private concerns:
It's a terrible idea to vest this much power with one company, even one as fun, user-centered and technologically excellent as Google. It's too much power for a handful of companies to wield.

The question of what we can and can't see when we go hunting for answers demands a transparent, participatory solution. There's no dictator benevolent enough to entrust with the power to determine our political, commercial, social and ideological agenda. This is one for The People.

Put that way, it's obvious: if search engines set the public agenda, they should be public. What's not obvious is how to make such a thing.

Search is too important to leave to one company - even Google

I write books. My latest is a YA science fiction novel called Homeland (it's the sequel to Little Brother). More books: Rapture of the Nerds (a novel, with Charlie Stross); With a Little Help (short stories); and The Great Big Beautiful Tomorrow (novella and nonfic). I speak all over the place and I tweet and tumble, too.


  • ral8158

    Seriously, Cory?
    You don’t have bigger fish to fry than the opt-in evil provided by Google?
    No societal change needs to take place to eliminate the issue you see. If you have an issue with your own privacy, you can use Tor. If you have an issue with search engine editorializing, use a different one.

  • Anonymous

    anon @43,

    Interesting. What regulatory framework *could* governments implement to limit growth? I can’t think of anything off the top of my head.

  • indiebass

    Thank goodness we’ve finally got Bing!

  • mdh

    And search engines could properly compete with each other and evolve towards situations more in keeping with what their users truly wanted.

    And here I was thinking search engines did what their CREATORS truly wanted them to do.

    When the internet can read our minds, we will be redundant.

  • Umbriel

    Attempting to make anything “public” by making it the responsibility of any governmental entity, _is_ the equivalent of entrusting it to a (hopefully) benevolent dictator. Choice is the best mechanism for ensuring public freedom, and the best way to ensure it is to limit or eliminate governmental interference (especially the allegedly benevolent kind, as in various types of corporate welfare) in the process.

    However huge Google may be, as long as nothing other than public choice acts to stamp out alternatives, it’s not being oppressive.

  • toyg

    Looking at policy trends, I don’t think I’d trust government with that power either, or any public body really.

    As soon as some “tough on abortion, tough on the causes of abortion” nutjob gets in, poof, lots of info on family planning and contraceptives would magically disappear. And once you have a state-sanctioned player, all the others can automatically be blocked by school networks, libraries etc, making them pretty much irrelevant (or worse).

    If you don’t like Google, by all means build an open-source replacement… but stay clear of government for anything, especially for funding.

    The road to hell is paved with good intentions; this is one of them.

  • JL Bryan

    Cory, love your writing, but I have to agree with the other posters–government should not control search engines! It was “the People” who chose Google over ventures like AltaVista. If Google ever dissatisfies, we’ll move to other search engines, maybe one that doesn’t even exist yet.

    If you’re suggesting an open-source noncommercial search engine, similar to Firefox or Wikipedia, then full steam ahead! But if you’re saying we need a Ministry of Truth controlling internet search–uh, no.

    State control is a poor substitute for freedom when it comes to information. Democracy is no good here, either; why should the majority control what the minority sees?

  • snsr

    Entrusting the “public agenda” to a private company is foolhardy, but why on earth would we want a government agency in charge of indexing and selecting information?! The idea of a government agency, no matter how “public” it is, setting my agenda any more than they already do is sickening. That’s *BIG BROTHER MOTHERFUC8R*

    Google won’t last forever, and whoever replaces them will be clever and also opaque. No organization, public or private, will ever wield that power kindly and selflessly.

    A more appropriate role for government would be creating some rules about transparency and neutrality.

  • z7q2

    Um, no.

    Take off the Google goggles and try other search engines, and search engine aggregators like Dogpile. Or build your own, as has been suggested. But don’t try and herd the search cats, it will annoy them and ultimately you.

    I think one of the flaws in the argument pitched is that search rank makes or breaks companies. It does, but if you were naive enough to build your business plan around search rank, you deserve to be broken.

  • turkshead

    The government needn’t be involved for there to be a public editorial mechanism. Creating an open algorithm-specification standard and making it easily pluggable into private infrastructure would do the trick; then you create a nice little “Doctorow-compliant search” logo you can display if your engine is using the public algorithm.

    As ever with standards, getting buy-in is the hard part.

  • Anonymous

    Google is a public company.

  • Cory Doctorow

    Gang, the word “government” appears nowhere in the article.

  • airship

    I’m a lifelong liberal, but in my opinion this is one of the rare areas where unrestricted free enterprise really does work the best. When somebody else does it better, Google’s market share will drop.

  • shadowfirebird

    Cory, presumably your concerns would be removed if Google published their search and page ranking algorithms so we could see how they worked?

    That makes perfect sense to me.

    OTOH, I don’t know any “big transparent public entities” that have enough money to run a google-size server farm. That solution sounds not so good.

  • Anonymous

    the public is too irresponsible and untrustworthy to be given control of anything valuable. That’s just the public in general. While there is a small group of people who would take the time to understand a public algorithm and the way search engines work, I would still be afraid to give them any control as people are generally not trustworthy. The masses are a totally different story. The masses wouldn’t take the time to understand it and would still attempt to control it [just like they do with current politics and anything else of broad public value].

    -Chris Allison

  • NE2d

    Gang, the word “government” appears nowhere in the article.

    When someone says “The People,” that’s usually code for “The Government.” You also mention “public entities” as opposed to “private concerns.” What exactly do you have in mind?

  • Billegible

    Having worked in search – yes, algos are “editorial”, in that you have to figure out what is useful and what is not useful and create algos to push the useful up and the non-useful down.
    However, the very “editorial” nature of the algos is what makes search work. A search engine that gives us dross in the top 5 is not considered a good search engine. No one is going to look through all the pages of a search – the nature of the algo is to try to get you what you are looking for in the top 10 and preferably top 3 results.
    Search algos try to dampen out spam, old results (2001 report on X vs the 2008 report on X), results in which your search terms appear in lesser proximity, etc.
    The reason Google has the stranglehold on the market that it does is that its algos are currently the ones giving us the most useful top 3/top 10 results for our queries. The “editorial” decisions made by the people who write and test these algos are what make that happen. Without these decisions as to what is useful and what is not, a search engine is useless – and trust me, I’ve tested the output of a newly born search engine and I have a pretty good idea of useless.
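
To make the signals named in the comment above concrete (spam, age, and term proximity), here is a toy scoring sketch in Python. Everything in it is an assumption made for illustration: the `Result` fields, the formula, and the weights are invented, and nothing here describes how Google or any real engine scores pages.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """A hypothetical search result; all fields are illustrative."""
    url: str
    term_matches: int        # how many query terms appear on the page
    term_span: int           # words separating the matched terms (smaller = closer)
    age_years: float         # how old the page is
    spam_probability: float  # 0.0 (clean) to 1.0 (certainly spam)


def score(result):
    """Combine the three signals into one number (invented formula)."""
    # More matched terms, closer together, score higher.
    relevance = result.term_matches / (1 + result.term_span)
    # Older pages are dampened.
    freshness = 1 / (1 + result.age_years)
    # Likely spam is pushed toward zero.
    return relevance * freshness * (1 - result.spam_probability)


def rank(results):
    """Return results ordered from most to least useful under score()."""
    return sorted(results, key=score, reverse=True)


results = [
    Result("spam.example/x", 2, 1, 0.0, 0.95),
    Result("reports.example/2001", 2, 1, 8.0, 0.0),
    Result("reports.example/2008", 2, 1, 1.0, 0.0),
]
ranked_urls = [r.url for r in rank(results)]
```

Even in a toy like this, every choice (how fast freshness decays, how harshly spam is penalized) is a judgment about what counts as useful, which is the commenter's point about algorithms being "editorial".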

  • Billegible

    @ 10 shadowfirebird presumably your concerns would be removed if Google published their search and page ranking algorithms so we could see how they worked?

    The reason they don’t do this is to prevent spammers gaming the system. They keep it secret so they can keep the spam out of the top 10 as much as possible to give you the results you want, not the results the spammers want to feed you.

  • AceJohnny

    So basically, people, you’d rather trust a capitalist corporate entity (which some argue exhibits psychopathic behavior), rather than your own democratically controlled government.

    I’m not saying I disagree, sadly, but wow, what an admission of the failure of democracy…

  • Rindan

    So basically, people, you’d rather trust a capitalist corporate entity (which some argue exhibits psychopathic behavior), rather than your own democratically controlled government.

    I’m not saying I disagree, sadly, but wow, what an admission of the failure of democracy…

    I don’t trust A capitalist entity over my government, but I do trust a pile of them who are busy slitting each other’s throats over my government. There is a fundamental difference between a democratic government and a functioning market. I don’t choose my government, but, if the market is working properly, I do choose my corporation.

    Democratically elected governments don’t reflect the best choice for you. They reflect the best choice as judged by the majority after some sort of weeding process. You can rest assured that when Bush was elected… TWICE, that did not represent my choice. I don’t think that just because he did (or did not) get a majority vote that he must be the best guy to lead. We let him lead because democracy is a game used to pick a leader that the people with guns agreed to play. We let him continue to lead because we would rather let a game determine the leader and have everyone just accept it rather than breaking out into a civil war every time two presidents/PMs/whatever split nearly 50/50 on who would make the best el presidente. If I thought that Bush or Obama were horrible rulers I don’t want to live under, my two choices would be a futile rebellion likely to leave me dead or in jail, or leaving the country.

    Now, in a functioning market, things are much different. I might think that Google is horrible and a super majority of the planet could disagree with me. Yet thankfully, their opinion doesn’t matter. I can ignore the ‘democratic’ opinion of everyone else and take my business to Yahoo. It is like if after Bush won, I was like “fuck that, Kerry is my president” and I just lived under his rule while everyone else went on living under Bush. In a functioning market your every decision is a choice among many options, with all the other idiots not being able to coerce you into following them. In a perfect democracy on the other hand, it just takes 51% of the people to think your opinion is wrong, and the other 49% have to live with it.

    I am not saying that in practice markets always work out shiny, and I am not suggesting that government is always evil. I am saying that when you have a functioning market, one where there is lots of choice, it is easy to switch companies, and one company isn’t crippling the others unfairly, a market beats the piss out of democracy.

    The search engine market is a great example of a place where a functioning market (which we have) rocks the shit out of democracy. There is lots of competition, changing search engines is free, utterly without hurdles, and none of the companies are doing anything to sabotage the others other than to build better search engines. If you think Google’s editorial review sucks… spend 2 seconds typing another search engine into your web browser and go somewhere else.

    There are places where rule by majority makes sense. Search engines that control my information pipe are not one of them. Rule by individual choice is clearly the better option here.

  • Jerril

    When someone says “The People,” that’s usually code for “The Government.”

    … HUH? Is this some sort of weird American quirk? Whenever I see “the People” it’s almost always directly in CONTRAST with “The Government”. It’s often either an anarchist or a libertarian rallying cry, but collectivism tends to use it too.

    “We, the People” doesn’t mean “the government” – it means the people who elected the government.

    You also mention “public entities” as opposed to “private concerns”. What exactly do you have in mind?

    I would expect (from Cory) something like not-for-profit organizations, open-source swarms, or the like.

  • angusm

    Jeff Atwood has a good post on the subject of the impact of a Google monoculture:

    http://www.codinghorror.com/blog/archives/001224.html

    concluding that “… if Google, for whatever reason, decided to remove you from its search results, your website no longer exists.”

    I’d love to see a distributed search engine, but the “if you build it, they will abuse it” rule means that from Day One such a thing would be under massive assault from spammers. Making it resilient enough to withstand this would be a real trick, and saying that ‘peer review’ will take care of the problem is overoptimistic when most of the ‘peers’ contributing to the index are likely to be infected Windows PCs controlled by a botmaster in Moscow.

    Ultimately, it comes down to trust. I trust Google (to some degree) because their interests are at least partly aligned with my own: their interest lies in delivering useful and ‘clean’ results so that I’ll keep using their site. A distributed search engine I can trust must also derive its data and behavior from people who don’t have a vested interest in lying to me. That starts to sound like a call for a web-of-trust approach. I don’t trust a million strangers, but I trust my friends and, to a lesser extent, their friends, and so on.
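
The web-of-trust idea in the comment above ("I trust my friends and, to a lesser extent, their friends, and so on") can be sketched as a breadth-first walk over a friend graph, with trust shrinking at each hop. This is only a minimal illustration: the function name `trust_scores`, the `decay` factor of 0.5, and the `max_hops` cutoff are assumptions invented here, not part of any existing system.

```python
from collections import deque


def trust_scores(friends, me, decay=0.5, max_hops=3):
    """Assign each reachable person a trust score that decays per hop.

    friends: dict mapping a person to the list of people they trust directly.
    decay and max_hops are illustrative parameters, not established values.
    """
    scores = {me: 1.0}
    queue = deque([(me, 0)])
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not extend trust beyond the cutoff
        for friend in friends.get(person, []):
            # In a breadth-first walk the first visit is via a shortest path,
            # so each person gets the score of their closest connection.
            if friend not in scores:
                scores[friend] = decay ** (hops + 1)
                queue.append((friend, hops + 1))
    return scores


friends = {
    "me": ["alice", "bob"],
    "alice": ["carol"],
    "carol": ["dave"],
    "dave": ["erin"],
}
scores = trust_scores(friends, "me")
```

A distributed index built this way could weight each contributor's data by that contributor's score, so an anonymous stranger (or a botnet node) with no path to the user would contribute nothing.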

  • JL Bryan

    “So basically, people, you’d rather trust a capitalist corporate entity (which some argue exhibits psychopathic behavior), rather than your own democratically controlled government.”

    Yes. Google would go bankrupt if people chose to stop using it. When our “democratically controlled government” is told clearly to do things like stop the war by a majority of voting citizens (as in 2006, 2008)…nothing happens.

    We don’t have the choice to withdraw our own money from the war–I’ve never supported it but have paid thousands of dollars into it. We do have a choice about where our money goes in the market.

    I would argue that the market, not the state, reflects the will of the people. The state reflects the will of whoever gives the most money to politicians. Market institutions (including voluntary nonprofits) must serve their customers/donors or disappear.

    Democracy is way, way overrated. In America, half the people don’t believe in evolution and are waiting for the Second Coming. Yeah, let’s base public policy on that.

  • bwcbwc

    Gee, Cory. With all your posts about the erosion of civil liberties and surveillance, you’re still willing to give the government the opportunity to monitor/log/trace all of our searches?

    I have to think you have something in mind like a non-profit organization funded by international NGOs with a peer-reviewed and publicly available search algorithm.

    Or perhaps an open-source project funded by a foundation?

    But please, please tell me you don’t mean “public” in the sense of government owned, controlled and monitored. I think I’d rather share my search history with advertisers.

  • JL Bryan

    Cory,

    I guess it wasn’t clear just what you were suggesting as an alternative. “Public control” nearly always means government control.

  • yannish

    The website owner and others on the web have a lot of say on what ranks well on Google, as demonstrated by the emphasis on backlinks and the importance they play in determining PageRank. The whole SEO industry is based on this premise.

  • Anonymous

    The start of a great discussion concerning the shortcomings of democracy.
    And we could also talk about some of the shortcomings of capitalism or free markets or whatever misleading euphemism is in vogue. The first of those that I want to discuss is the lack of freedom in these markets. Monopoly is always the unstated aim of the self-proclaimed free-marketeers. And what is worse, they have the government running interference for them.
    If, on the other hand, the governments actually regulated the markets, and arranged it so that increasing power led to diminished opportunities, rather than a vicious cycle headed for monopoly… then we might have a capitalism I could subscribe to. And search engines could properly compete with each other and evolve towards situations more in keeping with what their users truly wanted.
    Trust. It’s amazing that it still exists at all.

  • airshowfan

    As much as I love the idea of an open-source web-search engine in the spirit of Firefox and Ubuntu, I think that keeping spammers (or “spamdexers” or whatever) from reaching the top is one endeavor where “security through obscurity” might actually have a place.

    Like many people here (and not unlike Cory), I am a liberal who thinks that the free market will probably continue to do a good job bringing us useful search engines.

    But as Clay Shirky would remind us, a search engine can be built collaboratively “out of love” rather than for profit (This would arguably make it even better), and then compete with the older for-profit search engines (as has happened with operating systems, online encyclopediae, text-editing and image-editing software, etc).

  • Anonymous

    Surely the public already votes for search algorithms every day in the only way that is truly impartial – by using the one that gives them the results they want?

    Being angry or afraid because Google now “wields too much power” makes no sense – we voted for them yesterday, and we’ll vote for them again tomorrow.

  • DWittSF

    What’s the problem here? It sounds to me like it’s people who are too lazy to do anything more than click the top link or two. The real solution all along is to define better keyword searches and dig deeper into the search result pages…which uses the individual algorithms we all develop in our own heads.

  • kmoser

    Too bad Google doesn’t have any competition–oh, right, they’re called Yahoo! Or maybe we need an open directory of sorts–oh, wait, there’s already dmoz.org.

  • Anonymous

    Cory, surely you can see that Boing Boing’s power to declare things cool and interesting is far too dangerous to be wielded by any one man, even one as benevolent as yourself?

    In fact, every aggregator who makes themselves useful by consistently selecting for relevance and quality has this dread power.

    I call for public voting on every blog post everywhere, before they can be published. Of course, since that would mean publishing them so that everyone could read and discuss them…

    I call for the internet!

  • Bugs

    Wasn’t Wikia exactly what you’re talking about? It was an attempt at an open-source search engine with the search algorithms, ranking algorithms, etc. fully open and editable.

    Unfortunately, hardly anyone ever used it (Wikipedia reports it had a market share of 0.000079%!) and it shut down within the last few weeks. The latest version of the source code is still online and I’m sure the remaining community would be happy to help if Cory wants to pursue his vision.

  • Lobster

    I’m not really sure you understand how Google works. Pages are displayed based on number of appearances in links. So it’s almost like the pages you see first are the ones most recognized to reflect what you’re searching for.
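
The comment above describes ranking by how often a page appears in links. A minimal Python sketch of that literal idea, counting inbound links, follows. It is an assumption-laden toy: the function name `rank_by_inbound_links` and the sample pages are invented, and Google's published PageRank goes further by also weighting each link by the rank of the page it comes from.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Order pages by how many distinct other pages link to them.

    links: dict mapping a page to the list of pages it links to.
    """
    inbound = Counter()
    for source, targets in links.items():
        inbound[source] += 0  # make sure pages with no inbound links still appear
        for target in set(targets):  # count each linking page only once
            if target != source:     # ignore self-links
                inbound[target] += 1
    # Highest inbound count first; ties broken alphabetically for stable output.
    return sorted(inbound, key=lambda page: (-inbound[page], page))


links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
ranking = rank_by_inbound_links(links)
```

Because the counting rule here is fully visible, it also shows the gaming problem raised elsewhere in the thread: anyone can create extra pages that link to their own site and move it up the list.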

  • mralistair

    whether ‘public’ means ‘government’ is a moot point.

    Isn’t a more fundamental problem that any open algorithm will be open to spammers and scammers and so be gamed constantly.

    Google keep their algorithms top secret for this very reason (and the competition, but they’ve caught up pretty fast in terms of actual search results)

  • hep cat

    Cory

    The gist of this seems to be that the search algorithm should be known and open, which gives control of where in the search results a web page (or anything else on the internet) appears to the owner of the web page. Of course everyone would like their page to be at the top of relevant searches, and spammers would like to be at the top of ALL results. This sort of thing leads to the proliferation of aaa1 locksmiths in the yellow pages.

    The only way that would work is if there were a way of imposing standards for searchability on all of the webpages with some sort of enforceable penalty for non compliance. An example of non compliance would be making a Viagra sales site show up in searches for powerboats.

    I can’t think of any system where any entity published a standard for searchability that would not require that the same entity had editorial control of everything that showed up in the results.

    I don’t like that Google and archive.org follow the robots.txt when they are abused by government agencies and domain squatters.

    If you can think of a way for this idea to work without the same body that publishes the standards enforcing those standards on web publishers I’d like to hear it. For extra credit, since this implies some sort of contract or regulation, who would hear appeals?

    Who does all of this and who chooses them is a whole other discussion, I just don’t think you can standardize search results without standardizing content.

  • Anonymous

    I think you could ‘weight’ returns that come from Wiki or other publicly built repositories of knowledge. After all, isn’t that how Google ads work? It rates the relevance of the ad to the search criteria – so wiki (and similar) entries could be weighted more heavily as they have more relevance and/or are more publicly accessible. Cheers, Con O’Donnell

  • anansi133

    My knee-jerk objection to comparing search engines to wikipedia, is that most people have had some experience with paper encyclopediae, it’s not a huge leap to go from that to an on-line version. Any democratic endeavor is limited by the people’s understanding of what they’re doing.

    On the other hand, I have worked long and hard to build the bookmark list that I currently have, and if it meant benefiting from other people’s bookmark files, I’d be happy to share access to my own.

    I already use wikipedia over google for searches most likely to be crowded out with commercial hits.

    To succeed with this, you wouldn’t just out-compete google, you’d also beat craigslist and freecycle. A publicly maintained search engine would become a metacommunity all its own.

    I wonder if P2P networks like bittorrent could be leveraged to handle the bandwidth demands? If you could use clock cycles belonging to the users themselves, you might not need these huge data centers that google employs.

    The easiest way for me to think about testing this idea, would be a firefox plug-in that follows my browsing history and shares the results on the network. When I want to do private stuff that I don’t want to share with others, I turn off the plug-in. When I’m not on line, the same utility could be running in the background like a screensaver, crunching numbers for the other users.

    This is a great idea. The business model as it stands today is not sustainable: running ads to pay for a search engine is a conflict of interest, no matter how you soft-pedal it.

  • Anonymous

    I’ve thought this for a long time. Good to hear you asserting this Cory. I think there’s great value in attempting to decentralize this extraordinary power/control.

  • Anonymous

    Exactly. And all you IT weenies out there, like me, make your voices heard! opensourceopensourceopensource

    thanks boingboing, cory et al for keeping these ideas in the zeitgeist

  • PJDK

    I’m not sure to what degree Google really does set our agenda. Can Google really destroy you by removing you from search? It would make starting a new business tricky, certainly, but anyone already established would still be visited, and all hell would be raised with Google.

    And to what extent do they even set a social agenda? Speaking personally, at least, I get my news from a fixed set of sites whose opinions I value. While occasionally I like to dig deeper with Google, that’s pretty rare, and even then I’ll have something reasonably specific in mind (what do Katmandu’s newspapers make of this story?). When you use Google you tend to know what you are looking for, so Google either shows you it or it doesn’t. It is a library, not a newspaper.

    It is conceivable that they could abuse their influence, but I’m not sure what any gain would be for them; what is more, the losses of being found out would be enormous.

  • Anonymous

    Pardon me, but since Google is a public company, doesn’t that mean their decisions must be in the best interest of the public? They make money by getting people to use their service. If people stop using their service because it is ineffective or offensive, then they lose out on revenue.

  • greygrey

    Thank you for this topic; social action – and its cohesion – depend for their strength partly upon seeing the less-than-strictly-social forums / vectors out there. To counter the older-style forums, we need to cook up new forums in the real world for more of our natural social tendencies to ‘open with’.

  • retrojoe

    Search is not a public agenda, it’s a private one. It was invented by private companies, it searches private (and public) websites and it is something we (in the free world) choose to use. I actually work for a major Search Engine and I can tell you that we are not the manipulators of the internet that people seem to see us as.

    1) Search engines are just a complex, more complete version of the white and yellow pages. You need information and so you scan through pages looking for data. But somehow the order of that data was chosen. And if you think it’s 100% alphabetical you are wrong.

    2) Any algorithm, regardless of its complexity or who writes it, is inherently biased. We cannot escape that.

    3) This is not a free service we provide. We have to make money somehow and no one will pay for search. Advertising makes things even more biased.

    4) Algorithms can be changed at the drop of a hat. And different searches can run on different algos. We can show you how we came up with results but it’s kind of pointless. The next day it may be different.

    Your best bet? Use a multitude of search engines. Let competition drive the market and diversity in search.

  • Stephen

    This is like when people didn’t like the way groups were created in Usenet. The answer is to create an alternative. If people like it, they will use it. I find Wikipedia useful, but I would never use it as a search engine because it is often profoundly biased on specific topics. For example, the page on IQ promotes racist stories about skin color and intelligence that were long ago proven to be intentional fraud. As an encyclopedia, I use it as one source among many and sometimes read the discussion page to figure out what’s going on. A search engine with this same failure would be much less useful.

  • mdh

    How is this different from your audience wanting input to your editorial decisions?

    The answer to the problem is not the people making the gov’t make Google do something.

    The answer is the people, doing it better than Google, ourselves.

  • DWittSF

    Ironically, BoingBoing uses Google for its internal search. Why is that?