David Dobbs on the importance of open-source science

Right now, most scientific research exists behind paywalls. And expensive paywalls at that. A license to read a single peer-reviewed journal article can set you back $50. Depending on the journal, that number might be a little lower, or a little higher, but access usually doesn't come cheap ... even if the research was funded with public money. When I write about a paper, I usually have to request a copy from the researcher before I can even know whether the paper in question is one I want to write about. And it's not just journalists that get locked out. Even scientists themselves can't always get access to the papers they need to read in order to do their jobs. New science is being stifled by the old business of scientific publishing, argues science journalist David Dobbs.

Open-access journals are different. These publications—the most famous being the Public Library of Science, or PLoS—make all the papers they publish available to anyone online, rather than printing expensive paper copies for subscribers. In a great article at his Neuron Culture blog, Dobbs makes the case for open-access science:

The current system, they note, grew out of meeting notes and journals published by societies in Europe over three centuries ago. Back then, quarterly or monthly volumes could accommodate the flow of ideas and data from most disciplines, and the printed journal, though it required a top-heavy, expensive printing and publishing infrastructure, was the most efficient way to share those ideas.

"But now," says Jonathan Eisen, "there's this thing called the Internet. It changes not just how things can be done but how they should be done."

... To get a sense of how the current system curbs science, consider a rare case in which researchers attacked a big medical problem with an open-science model. In 2004, in the United States, a network of government and private researchers, including large drug companies, used open-science principles to accelerate research into Alzheimer's. The project, as Gina Kolata aptly described it in the New York Times last summer, "was an agreement ... not just to raise money, not just to do research on a vast scale, but also to share all the data, making every single finding immediately available to anyone with a computer anywhere in the world. Before that, researchers worked separately, siloing off much of their work. Now methods and data formats were standardized. The data would immediately enter the public domain, where anyone could build on it."

An extraordinary project ensued. The U.S. National Institute on Aging contributed over $40 million, and 20 companies and two nonprofit groups kicked in another $25 million to fund the first six years. The program produced an explosion of papers on early diagnosis and helped generate more than 100 studies to test drugs or other treatments. It greatly sped and opened the flow of findings and data. According to the New York Times, the project's entire massive database had been downloaded more than 3,200 times by last summer, and the data sets containing images of brain scans were downloaded almost a million times. Everyone was so pleased with the results that they renewed the accord this year. And all because, as a researcher told Kolata, "we parked our egos and intellectual-property noses at the door."


  1. Every piece of research paid for by the US government is available in PubMed Central for free; however, it usually takes 1 year to get there because it would ruin the publishing industry if it didn’t. Not every scientist can afford the extra charges to pay for immediate public access of their work. None of this is free; unfortunately, you need to pay editors etc. to keep the work clean enough. I can see why this would be a problem for a journalist, but scientists can get anything that was written here within a year. It would be nice if other countries followed suit.

    1. Actually that isn’t quite true. The PubMed Central agreement only applies to research funded by the NIH — only one of the many US funding agencies. Research funded by the NSF, the DOE, the USDA, etc. currently has no such requirements. While you can find research supported by the agencies in PubMed Central, that is only because the journal itself is agreeing to release it out of good will (or if the journal is open access to start with).

      I think the NIH/PMC agreement is good; but it doesn’t cover the majority of US funded research, not even the majority of US biology research.

  2. These paywalls are a drag for science/technical translators as well. Elsevier, for example, blocks access to articles I need to clarify concepts I don’t understand or verify names or numbers from blurry PDF copy. When the publisher bounces you off the paywall, you have to figure out how to extract enough useful information from Google’s previews. Sometimes you can’t.

    Thank goodness so many patents are available online. That significantly increases the usefulness of my translation output.

  3. When a paper is presented at a conference, unless a pre-print is also distributed, less than 25% of the paper is treated in the time window given the speaker. Then as much as a year can pass before the entire paper appears in that society’s journal.
    I spent 20 years button-holing speakers and exchanging business cards to get on a first name basis with those steering a technology change. It worked for me but it wasn’t easy.
    Having an open forum for publishing and a wiki for discussion speeds things along. It’s amazing what one person can jar out of the brain of another.

  4. I totally agree that the whole journal subscription thing was once useful but has now become an anachronism. However, as a researcher, the fees for publishing in open-access journals also seem somewhat extortionate – it costs hundreds of pounds to publish one twenty-page online paper. Usually, of course, that money comes from the research grant and not from the scientists’ pockets, but it still seems like a lot.

    Personally, I don’t really see why money has to be involved at all, except to keep the servers running. The latter isn’t a huge cost and it would be very worthwhile for governments or universities to fund it. The people who review the papers are not paid anyway, and we scientists have been typesetting our own papers for conference proceedings for many decades. The only person who needs to be paid at all in the current system is the editor, who decides who gets to review the papers, and makes the final decision on publication. But it surely wouldn’t take much thought to find a workable way to crowd-source that job. Then the whole publishing company could be replaced by a publicly-funded website – something like arXiv.org but with a rigorous peer-review procedure – and no money would have to change hands at all, either when downloading or when publishing a paper. Only then will science be truly democratised.

    1. we scientists have been typesetting our own papers for conference proceedings for many decades

      No. *Computer* Scientists (and theoretical physicists, who practically are computer scientists these days) have been typesetting their own papers. Not scientists in general. In the world of biology, at least, Microsoft Word is still king. As a biologist who is a UNIX geek (and who even formatted his dissertation in LaTeX), I lament this fact, but it is just not realistic to expect non-computational people to learn what amounts to a programming language to typeset their own papers. So biology journals (even open access ones) still need paid typesetters.

      1. It’s even dumber than that. I’m an astrophysicist, and we LaTeX practically everything. But it’s mostly a waste of time. We even make it more of a waste by having an expectation that people’s arXiv submissions have the “look and feel” of the journals. Yes, this is what I looked forward to as a kid thinking about being a scientist: Hours of “but I put in [htbp] why isn’t my float what is this even?” (shouldn’t complain though…).

        The last paper I submitted was a code description paper, on an open source data analysis application I developed (getting permission from my institution to release open-source code was no mean feat either, talk about obstacles to science). In the text I had some example command lines, with “--argument.” This was fine in my LaTeX version, but the copy editor told me “the software can’t do two dashes like that.” Wahuh? Oh yes. They use entirely other, probably really expensive and old and weird software. They probably type it in by hand or something.

        So all the LaTeXing? Yep. Busy work.

  5. Costing publication in open source journals into your grants (the big Research Council ones anyway, if not from charities) is becoming pretty standard here in the UK. My research support office whacked it in my last grant application as part of the costing process. So the idea is certainly spreading already.

    1. I agree that the publication fees seem expensive, but I think you underestimate the costs of keeping an online journal going. For a large journal like PLoS One, there are dozens of people involved behind the scenes keeping the editorial process going :


      Also, “keeping the servers going” is not trivial either. There are no web apps for running a peer-review journal; all the software is custom, and must be maintained and upgraded. (An open-source journal management system would be an invaluable project, by the way.) Getting a paper through the review process requires perhaps a hundred separate communications, each of which must be handled somehow. And then there is the cost of maintaining the archive of all those communications, the paper itself and its supporting materials, in perpetuity through unknown future software upgrades.

      Could it be done for less? Possibly. But probably not a lot less.

      Consider also that a four month project involving three researchers costs the funding agency at least $100,000. If you’re looking for something to get upset about, start with the 40-60% of the money that got sucked into overhead, not the 1% spent on publishing fees.

  6. It is silly. Even as an undergrad I find it pretty frustrating that if I’m not on university grounds it’s nearly impossible to do research. All the online journal repositories are paid for by the uni, so any non-uni internet connection is out-of-bounds.

    We, as a society, should be beyond locking off knowledge based on how much you can pay for it.

    1. Until open access becomes the norm, see if your university offers VPN (Virtual Private Network) services. They likely do. That will allow you to connect remotely to the university’s network from any computer you own. This will make it look to an outside observer (like a journal publisher’s website) like you are coming from the university network, so taking advantage of your university’s access should no longer be a problem.

  7. Even for students at relatively prestigious universities (hi!) barely a month goes by where I don’t find a paper relevant to my work that I don’t have access to.

  8. i’m suddenly grateful that my university has excellent access to pay-wall journals
    grateful until i remember how much tuition i pay, that is

  9. In physics and math, a massively important role is played by the arXiv: http://arxiv.org . You write your paper, you upload a copy to the arXiv (for free), then anyone anywhere can read it (for free). Then you submit it to a journal, who will perhaps eventually publish it. But the published version will probably be behind a paywall, so many people will read the arXiv version.

    The arXiv’s got so important now that no publisher seems to dare to deny you the right to put a copy there. This is a great thing, and I feel sorry for people in subjects where free online copies are not the norm.

    It should also be said that “open access” doesn’t necessarily mean that the author pays. Some of the smaller mathematics journals are free to both author and reader.

  10. Stallman {who, as we all remember, rejects the term “open source”} talked about this *long* ago. In an article titled “Science must ‘push copyright aside.'” It’s here: http://bit.ly/cCdS2u

  11. Two other issues:
    1) For those without grant support, the publication fees can prevent dissemination of results.

    2) There is no motivation on the part of publishers to allow access forever. Many sites allow free access to the past 5-10 years–but older material is locked away. And even if it is still available–for how long? When new methodologies arise that allow old, abandoned questions to be addressed more effectively, we often cannot get online access to the early/original research. Some people assume if it’s old, it’s irrelevant, but I’ve avoided repeating mistakes by reading the old lit. Fortunately, my library has old paper journals still–although there is increasing pressure to discard them.
    Also, there often seems to be some older scientist reviewing articles who is upset if you don’t cite the classic research.

  12. A long comment, but this is a big area! Apologies to the moderator if this is considered too long for the comments section.

    This is a large area of development right now and has been growing across disciplines for about 10 years (http://www.soros.org/openaccess), although the roots go back further than that. For publications and articles, this is generally called “Open Access” by practitioners, rather than open science, which acts more as an umbrella term containing open access, open data, open peer review etc.

    In addition to the general approach of this piece and most of the comments, there is another approach to open access: by using repositories to store digital duplicates of published material.

    arXiv was mentioned, and is the pre-eminent example of this approach as a *subject-based* repository, but there is an infrastructure of *institutionally-based* repositories that allow all disciplines to archive their work. As the repositories are cross-searched either through general searches like Google, or through the metadata protocol used in the area – OAI-PMH – it makes no difference to the searcher whether the repository is subject based or institutionally based. Hosting them at an institution helps cover the costs of running them, which one commenter noted as an issue for arXiv.

    What/where are these open access repositories? The Centre for Research Communications at the University of Nottingham, where I work, runs a global registry called the Directory of Open Access Repositories (OpenDOAR): there are currently 1,960 of them. http://www.opendoar.org/find.php
    We do a breakdown by various stats and a Google map where you can easily see that this is now a global phenomenon. http://bit.ly/m5b48C

    Open Access publishing has been the subject of intense debate, and is growing fast. The Directory of Open Access Journals (DOAJ) based in Lund University, Sweden, currently lists 6,518 journals. http://www.doaj.org/

    The two approaches, OA publishing and OA repositories, are complementary and work together. Both approaches have advocates: what is needed is open access to research.

    Research funders, like the NIH, are starting to realise the benefits it gives to their funded research – wider readership and re-use – and are mandating open access for the grants they issue. Politicians and policy makers are seeing the cost savings of OA – a report showed annual savings and benefits of £270M just in the UK in a switch to systemic open access. http://www.jisc.ac.uk/publications/reports/2009/economicpublishingmodelsfinalreport.aspx

    There are direct benefits for academic authors, in that OA means their work is more widely read, there are new collaborative or comparative opportunities and citations are increased. There are benefits for journalists in that they get direct access to the latest and best research, rather than easy access to self-promotionists and fad-advocates and restricted access to the provable stuff. And of course, the public get access to the results of their tax-investment: publicly funded research should be publicly available.

    If you are interested, we run a blog which lists some further references to OA issues as a starting point for onward exploration: http://rcsproject.wordpress.com/oa-answers/

  13. As the author of the piece linked to here, I’m very glad to see this discussion going at Boing Boing — it’s this sort of exchange, and these sorts of issues, that need to be engaged to move sci publishing into a more accessible and fluid form.

    In the good news department, Jonathan Eisen reported at his blog today that he has finally succeeded in getting almost all his dad’s publications listed and made available (in PDF form) at the page he built for his dad at Mendeley. Links here for

    Jonathan’s post on almost completing the project: http://phylogenomics.blogspot.com/2011/05/freeing-my-fathers-publications-part-5.html

    His dad’s web page at Mendeley, with all the papers: http://www.mendeley.com/profiles/howard-eisen/

    A post of my own this morning giving him a nod for getting this done:

    The original article Maggie discusses is at


    David Dobbs
