Test, Learn, Adapt: using randomized trials to improve government policy

"Test, Learn, Adapt" is a new white paper documenting the ultimate in evidence-based-policy: government policies that are improved through randomized trials. It's co-authored by Laura Haynes, Owain Service, Ben Goldacre and David Torgerson. Ben Goldacre elaborates:

We also address – and demolish – the spurious objections that people often raise against doing trials of policy (like: “surely it’s unfair to withhold a new intervention from half the people in your trial?”).

Trials are widely used in medicine, in business, in international development, and even in web design. The barriers to using them in UK policy are more cultural than practical, and this document will, I hope, be a small part of a bigger battle to get better evidence into government.

More than that, the paper describes several fun examples of trials that have been conducted in UK government over just the past year, reporting both positive and negative findings. The tide is turning, and there are lots of smart people in the civil service.

Anyway, I think (I hope!) that the paper is readable and straightforward, like the Ladybird Book of Randomised Policy Trials, and I really hope you’ll enjoy reading it.

It’s free to download here.

Here’s a Cabinet Office paper I co-authored about Randomised Trials of Government Policies


  1. I suggest the US should try this by first dividing the country into 50 regions.  Depending on the policy being studied, some regions will all adopt a single common policy, while others try an alternate common policy.  At other times, each region will test a different variation of a single policy.

    Periodically, each region should also peer review the results of other regions.  If another region is achieving better results with a different policy, a region can choose to modify their policy to match the more effective policy.

    Subregions within each region should also be encouraged to try new policies that are compatible with regional policies.

    Just a thought.  I’m pretty sure it can be done.

      1. In primate testing, a few of the monkeys are going to end up flinging poo.  Hard to be a poo flinger without being covered in it.

    1. Sounds like amazingly poor experimental design to me.  If you keep the same regions, yet are running multiple experiments on divergent populations, your results will be so confounded that they will be utterly meaningless.

      The key point is to randomize across a population. 

      Imagine you want to implement a new teaching program for schools.  What will you have learned when you implement it in Massachusetts and don’t do it in Alabama?  You will learn nothing.  You can’t compare the two because Massachusetts has a vastly higher education level in the population and tends to be richer.  You can’t apply the learning between the two populations.

      Hell, you can’t even tell if it worked in Massachusetts.  If the students’ scores go down, is it because of the new program, or is it because the recession is causing more students to get a job during the school year and because funding to schools is being reduced?

      If you randomize, you will actually be able to pull apart if your program worked.  You will be comparing two equal populations that had the same average conditions, except for the change you made.

      Sadly, politicians are too fucking stupid to see the value of running randomized controls.  They would rather blindly fling programs into the world, trusting their “feelings” that it is totally better than the alternative. 
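
      The Massachusetts-versus-Alabama argument above can be illustrated with a toy simulation. This is only a sketch under made-up assumptions: the regional baselines (75 and 60), the noise level, and the true program effect of 2 points are all hypothetical numbers chosen for illustration, not data from the paper or from any real trial.

```python
import random
from statistics import mean

TRUE_EFFECT = 2.0  # hypothetical gain in test score caused by the new program


def simulate_student(baseline, treated, rng):
    """Score = regional baseline + random noise (+ program effect if treated)."""
    return baseline + rng.gauss(0, 10) + (TRUE_EFFECT if treated else 0.0)


def compare_by_region(n, rng):
    """Run the program in the high-baseline region only, then compare regions."""
    treated = [simulate_student(75, True, rng) for _ in range(n)]
    control = [simulate_student(60, False, rng) for _ in range(n)]
    return mean(treated) - mean(control)


def compare_randomized(n, rng):
    """Pool both regions and assign each student to the program by coin flip."""
    treated, control = [], []
    for baseline in (75, 60):  # hypothetical regional baselines
        for _ in range(n):
            in_program = rng.random() < 0.5
            score = simulate_student(baseline, in_program, rng)
            (treated if in_program else control).append(score)
    return mean(treated) - mean(control)


rng = random.Random(0)  # fixed seed so the run is repeatable
naive_estimate = compare_by_region(20_000, rng)
randomized_estimate = compare_randomized(20_000, rng)

print(f"true effect:          {TRUE_EFFECT:.2f}")
print(f"region-vs-region:     {naive_estimate:.2f}")
print(f"randomized estimate:  {randomized_estimate:.2f}")
```

      In this sketch the region-versus-region comparison mostly measures the 15-point gap in baselines rather than the program, while the coin-flip assignment spreads both regions evenly across the two groups, so the difference in means lands close to the true effect.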

      1. I pretty much agree with everything you said.  My point (made somewhat sarcastically) was that trying out policies on select groups isn’t a new idea.   The logical conclusion, as you pointed out, is that the test method hasn’t always been implemented rationally or even very well.

  2. I tried several months ago to read “Bad Science” by Ben Goldacre but had to put it down.  His ego and ad hominem attacks on those with different views really did a disservice to his argument about the objectiveness of science.  His book was anything but objective.  The book cover should have been a tip-off as to the contents and lack of depth.  FWIW, I’m a scientist and did not stay at the Holiday Inn!

  3. Ah…the UK, that explains it.

    Reminds me of an article about US health care I just read in The Economist: 


    The pertinent bit:  “To suggest curbing an American’s health care is like threatening to kidnap his child. More care, he believes, must be better care. On Mr Obama’s watch new attempts have been made to weigh evidence for different treatments. But the notion that evidence might be used to limit care remains heretical.
    To an outsider, it might seem helpful to know which services are worthwhile. America, after all, spends 18% of its GDP on health, far more than other rich countries. About one-third of that spending is waste. But America has a unique distaste for evidence.”

    Indeed: just because you prove something doesn’t make it right.

    And really, what good is ‘evidence’ compared to moxie, gut feeling and Manifest Freaking Destiny?!

  4. Ummm . . . randomized trials are not the only way to conduct a scientific investigation. They aren’t even the most common way. In fact, they’re pretty much limited to testing new medical treatments. Medicine may or may not be the best scientific model for public policy, but you kind of need to prove that it is before explaining how to do it. It is also difficult to see how to pull this off without forbidding some pretty fundamental rights, like being able to move to a new town if you want.

    1. Truly, social scientists and economists have found “other ways”.  They suck, and they are done because they have no choice.  Randomized trials are almost always the best, if perhaps not always the most practical, way to determine if something is working.

      Social sciences are cast off into the “soft science” field because they struggle so mightily to show the effect of anything in a convincing manner.  I can pretty clearly articulate how dropping some phosphorus atoms into a silicon structure is going to change its electrical properties and be almost perfectly spot on.  Social scientists are more or less helpless in predicting the effects of social policy.

      You don’t have to give up fundamental rights.  You can still let people move around all they want.  Randomized trials are perfectly able to handle participants that drop out or change.  Besides, if people are fleeing a town to get to another place with different rules, I would say that chances are high that your new rules suck.  

      1.  Do you think that the same research methods that are used to articulate phosphorus atoms are the same methods that can be used to demonstrate the effects of social policy?

        What evidence do you have for your statement that social scientists are helpless in predicting effects of policy? In my experience it is very easy to predict effects of policy. The problem is when people ignore evidence.

        1. Do you think that the same research methods that are used to articulate phosphorus atoms are the same methods that can be used to demonstrate the effects of social policy?


          If you set up a randomized trial, the way you interpret the results is the same, regardless of the subject.  You could be studying the tree frogs of wherever, boron doping in semiconductors, or school lunch programs.  This is what makes scientific learning so powerful.

          What evidence do you have for your statement that social scientists are helpless in predicting effects of policy? In my experience it is very easy to predict effects of policy. The problem is when people ignore evidence. 

          Can you point to an issue where study of the data has led to a consensus on the best way to implement a piece of social policy?  We do have some learning, but only when the effects are utterly disastrous.  No one is going to try collective farming or building massive dense housing projects for the poor again, but that is only because those ideas that social scientists advocated were so utterly devastating to society and the body count was so high that their failure can’t be ignored.

          In most things without a disastrous body count, there is no consensus.  You can easily find a horde of social scientists for and against voucher systems for schools or health care.  You can find a horde of social scientists that argue for radically different levels of wealth re-distribution.  There is no consensus because neither side has worthwhile data.

          Social scientists do the best they can, but without worthwhile data, answering politically charged questions is all but impossible.  We should be giving them the tools to answer such questions and following the traditional scientific method for determining outcomes.  That means actually testing your ideas in a verifiable way.

  5. The problem is not a lack of randomized trials.  The problem is that many of our leaders don’t want to  “improve governmental policy.”  They want to demonstrate that government can’t, and shouldn’t be improved.  They do this by breaking it.

  6. hmm i say we take humans out of this, have the randomized trials done by machines. they can test us in various policy choices and through polling determine our overall well being and happiness, then change the parameters in turn to optimize for peak happiness and well being.
    i for one welcome our robot overlords.

    1. We can simulate the effects of a megaton nuclear bomb in a computer, but we can’t simulate the effects of a change to U.S. mercantile law.

      Is it telling, or surprising, that we as a people find it easier to predict the effects of nuclear and environmental physics than to predict the effects of our legal code?

      1.  i didnt say the robots would simulate the test…

        also predicting what humans will do can be tricky…there are so many of them and they all have their own opinions, whats more something people liked in the past they may not like in the future. tricky little creature humans eh?

  7. I work in government (in Australia) and have thought about this. Normally we put things out to trial or for comment- one thing at a time. Often this means one gets conflicting feedback. We then make changes based on this feedback, but often it is tricky to get a clear picture of which way to go. I like the idea of putting out a few alternatives in order to get a better idea of which way to go. The paper is on my reading list.

  8. I have wondered for some time if anyone (in power) has thought to employ game theorists to analyze the impact of new laws, to identify likely results and side effects, as well as identify loopholes. Seems like the kind of people who can show you how to defeat, say, the Prisoner’s Dilemma, might be better used to figure out how, say, corporations could abuse H1B visas, or how they will change as a result of Sarbanes-Oxley, or how banks might behave if Glass-Steagall is repealed.

  9. RCT and some bloody unit tests.  Could you imagine a huge tangled body of rules that nobody could grok, had no example usages, and could change and conflict without anyone finding out?  … oh wait.
