Free ebooks correlated with increased print-book sales

A new study from two academics at BYU, tracking the sales of printed books following free ebook releases, found that a free ebook release is generally correlated with increased sales. Interestingly, the exception is a group of ebooks that were released for a week and then withdrawn -- part of Tor's launch strategy, and a success in getting a large number of people signed up to the site. Very nice to see some crunchy data in the mix.

Those who have advocated the release of free ebooks to boost print sales of book titles have been perennially dogged by arguments that they rely too heavily on the anecdote. That is, they tend to hype singular cases of success -- the wayward example of a book's sales rocketing after the viral spread of its ebook counterpart online.

However, John Hilton III and David Wiley have recently examined sales for 41 print titles before and after they were released online for free. Their study was just published in The Journal of Electronic Publishing and is titled 'The Short-Term Influence of Free Digital Versions of Books on Print Sales'. They organized the books they studied into four groups; three of the four groups saw increased sales after the books had been made available for free.

The Short-Term Influence of Free Digital Versions of Books on Print Sales

  1. I got a first edition of Little Brother after reading a .txt file (stayed up all night and read it straight through). Now Cory’s books are on the default list of things I look for when wandering through bookstores.

    Have bought many albums and gone to many shows because I found good new music via bittorrent, whereas I would not have checked those bands out had I been forced to pay to do so (poor grad student).

    So yeah. Sheldrake once said, “The plural of anecdote is data.” People contest this (in terms of experiment design), but it’s clever.

  2. Interesting numbers, but a surprisingly limited study to be published in a peer-reviewed journal. First, the study does not include a control group of similar titles that were not available for free for the same period.

    Second, the researchers implicitly assume that book sales have a uniform distribution over time. The numbers are not related to the development in sales over time for each book (i.e. product life cycle), and there is no discussion of the effect of marketing activities, other events that bring attention to a book, etcetera. I have seen some other numbers showing that impressions on the web generate sales in physical channels, and a launch of (and probably the mere existence of) free ebooks is likely to have a similar effect (e.g. Magid Abraham on The Off-Line Impact of Online Ads in Harvard Business Review, April 2008, p. 28). Not discussing potential long-tail effects is correct, as the article focuses on short-term effects.

    Third, they have numbers and don’t bother to calculate statistical significance? They use absolute and not relative change (percent), and they don’t discuss the effect of relative size (sales volume) and outliers. In all four groups, one book is a major contributor to the group total (titles 5, 12, 14 and 40).

    Overall, the study has several methodological limitations. The problem is that the researchers don’t discuss any of these or their potential effect on the findings. Hmmm, seems that we have to rely on anecdotes after all….

    The good thing is that I have a fresh example for my students in a course on data analysis next week :)

  3. I have often said that the interwebs work well as free advertising. If you are allowed to try something out for free, then you will determine what is worth buying. I am not going to pay to purchase a cheesy pop song that I’ll be sick of in a week (not even for ironic value), but something that I will truly enjoy for years I will buy as soon as I can. I like to be able to try before I buy. I would much rather read a print book than an e-book, but the e-book is good to determine whether I want to spend on the print book. If you release a product that is of little value don’t be surprised when people don’t buy it, but if you release a quality product watch your profit soar.

  4. If these results were just noise, then the difference between before and after sales should have mean 0. You can test that hypothesis:

    $ R
    > t.test(c(-64, 341, 65, 95, 21))

    One Sample t-test

    data: c(-64, 341, 65, 95, 21)
    t = 1.3497, df = 4, p-value = 0.2484
    alternative hypothesis: true mean is not equal to 0
    95 percent confidence interval:
    -96.82553 280.02553
    sample estimates:
    mean of x
         91.6

    Even on the Tor sample, where they claim that there’s a negative correlation, and where there’s a decent amount of data, the p-value from the t-test is 0.2.

    Nothing to see here, move along.

  5. I can’t say I find any of that data compelling at all, especially considering the study lacks any point of reference to the sales of comparable backlist and frontlist titles over a 16 week period without e-book availability. Sales fluctuate, usually downwards.

  6. I agree, this study is seriously underpowered.

    For the group above, the mean effect on weekly sales is +5.6%. However, the standard deviation is 17.0%, so you can’t draw any statistically significant conclusion.

    For non-fiction, the effect is +45% with SD 68%.

    For fiction, the effect is +55% with a crazy SD of 168%. That there appears to be a good chance of getting an effect worse than -100% sales just goes to show that you should be wary about using simple stats to get true confidence intervals.

    The Tor data is obviously compromised because of the advertising. People go to the site to buy a print copy and get told, “Hey, why not get this as an ebook for free!”.

    Using absolute values is arguably a bit sneaky. The effect is dominated by the most popular book. The main problem is that there is no control group, so we can’t see if the sales variance is noise or an effect (whether positive or negative).

    However, regarding Cory’s actual point, the study is interesting. A good way to spin the conclusion is that in a study of several books, the effect of ebooks on print book sales was not statistically significant. The null hypothesis actually works in Cory’s favour.
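
    As a rough sketch of where those conclusions come from, the R below recomputes a one-sample t statistic from the summary figures quoted above, plus the normal-approximation tail for fiction. It assumes n = 5 for the first group (the five differences in the t-test above); `t_from_summary` is a made-up helper, not a base R function, and none of this comes from the study itself.

    ```r
    # One-sample t statistic and two-sided p-value from summary statistics.
    # t_from_summary is a hypothetical helper, not part of base R.
    t_from_summary <- function(mean_pct, sd_pct, n) {
      se <- sd_pct / sqrt(n)                  # standard error of the mean
      t_stat <- mean_pct / se                 # test against a true mean of 0
      p <- 2 * pt(-abs(t_stat), df = n - 1)   # two-sided p-value
      c(t = t_stat, p = p)
    }

    # First group: mean +5.6%, SD 17.0%. n = 5 is an assumption.
    first_group <- t_from_summary(5.6, 17.0, 5)
    print(first_group)

    # Fiction: mean +55%, SD 168%. Under a normal approximation, this is the
    # share of outcomes below -100%, which is impossible for real sales.
    p_below_minus_100 <- pnorm(-100, mean = 55, sd = 168)
    print(p_below_minus_100)
    ```

    With these inputs the first group's t statistic is about 0.74, nowhere near significance, and the normal approximation puts a bit under one fifth of fiction outcomes below -100%, which is exactly why the simple stats should not be trusted here.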

  7. My theory is, when a book is released for only a week and then withdrawn, the perception of value for the digital copy grows, and it becomes more likely that the reader will put up with reading it on a screen vs. a paper copy. It also acquires a value for saving and trading. When the digital copy is distributed for free and is easily replaceable, there is no perceived loss or redundancy to the reader in buying the print copy if they enjoy the content.

  8. It’s great to read these comments, especially those that point out flaws in the study; it’s the only way to improve. I am the first to admit that the study is far from perfect. Using historical data and comparison books, and having stats on the number of downloads, would all have strengthened the study. I am remedying those design flaws in a current study.

    The points about statistical significance are important. As we said in the conclusion of the article, “The results of the present study must be viewed with caution. Although the authors believe that free digital book distribution tends to increase print sales, this is not a universal law. The results we found cannot necessarily be generalized to other books, nor be construed to suggest causation. The timing of a free e-book’s release, the promotion it received and other factors cannot be fully accounted for. Nevertheless, we believe that this data indicates that when free e-books are offered for a relatively long period of time, without requiring registration, print sales will increase.”

    That last sentence means that when the Tor data was excluded, 13 out of the 17 books saw increased sales. Whether or not there is statistical significance, there may be a practical significance.

    One last point. The fact that book sales didn’t totally tank has important implications for books as open educational resources. I think there’s a huge benefit to society by making something available for free. Recently I’ve been involved with another study where eight books were downloaded 100,000 times over a ten week period. Sales increased moderately, but the point wasn’t whether sales increased or decreased; here are 100,000 people who accessed works who otherwise probably wouldn’t have. If free e-books aren’t harming sales, it makes it easier to have a conversation about educational benefits.
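
    The 13-out-of-17 figure can be checked with a simple sign test. The sketch below is not from the article; it assumes each book is an independent trial and that, absent any effect, a rise and a fall in sales would be equally likely (an assumption other commenters dispute, since sales usually drift downwards).

    ```r
    # Exact two-sided binomial (sign) test: 13 of 17 books with increased
    # sales, against the null that a rise and a fall are equally likely.
    sign_test <- binom.test(x = 13, n = 17, p = 0.5)
    print(sign_test$p.value)
    ```

    The two-sided p-value comes out just under 0.05. A sign test ignores the size of each change and cannot stand in for a control group, so this is borderline at best.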

  9. All I have is anecdotal, but I will say this:

    I’ve downloaded half a dozen ebooks in the last few years (mostly through the Tor promotion). And most of those have resulted in me buying not just the book that I downloaded, but also pretty much everything else that author has written. I have a really hard time reading books by people who write poorly (I’m looking at you, Goodkind), so when I find an author I like I tend to collect their books.

    I’m working my way through Sir Pratchett’s stuff right now.

  10. How come articles never claim “causation”? Getting from correlation to proof of causation is often difficult.

  11. To a layman (i.e. not a statistician) and a writer, this is interesting, and like Mulder I want to believe. However, I’d like to see the results of a longer study. At this point, the free release pre-release is an exception rather than a rule, and I wonder how much of the sales result is due to that exception. If this were to become a common practice, would the same results occur? I think the result for this small a group over this short a period is interesting, but not compelling evidence for widespread adoption of the policy.

  12. I’m going to go out on a limb here, and suggest that publishing lousy books free on-line will decrease sales.

    As a corollary, publishing books people like to read free on-line will increase sales.

    You know, just a wild guess. They say common sense ain’t so common any more, if it ever was.

  13. Cory Doctorow is making himself famous by advocating the free strategy. That increases his book sales, for sure.

    Right now, most good new books cost money, and people are acclimated to the idea of paying for them.

    If most authors and publishers (instead of the current few) start giving away books for free in hopes of increasing sales, the previous statement will no longer be true.

    The proliferation of free books will make consumers reluctant to pay for books, just as they are currently reluctant to pay for content online (where quality free material is in abundance).

    So this study might be correct, but it is only correct in this moment. If all eBooks are given away free, the effect will dissipate, and publishers and authors will be undercutting their ability to sell ebooks, and be paid, in the future.

    When certain newspapers started publishing free to the web fifteen years ago, no doubt their profiles were raised, and they gained readers. But that practice is now referred to as “the original sin of journalism” and responsible for cutting the legs out from underneath their industry. People are now unwilling to pay for news. They expect it should be free. If they encounter a paywall, they go elsewhere.

    The same will happen to books if authors and publishers rush to release free ebooks in the hope it will increase their sales.

    Furthermore, in the next decade we will trend away from paper books. When eBooks are 50% of the market, what will it mean to say free ebooks increase print sales?

    I think the “free drives sales” phenomenon is real, but it is a temporary phenomenon enabled by shifts in the publishing industry, and I think if it is misinterpreted as anything else, it could lead to publishers and authors making the same mistakes journalism made fifteen years ago.

  14. As an author, I experimented by giving away an ebook for “Read an Ebook Week.” The ebook contained a novella and excerpts from two ebook novels, with all three also available at Amazon. It led to increased sales for the two novels, as expected, but it also increased sales for the book I was giving away. I am not suggesting that giving away an ebook will lead to increased sales for that ebook (though JA Konrath is doing quite well with that strategy), but it is a fertile swamp of experimentation right now.

  15. There’s a great piece by Kent Anderson at The Scholarly Kitchen analyzing this study and questioning some of the factors that are dismissed as “tangential” but which are probably extremely important. Things like one of the authors being nominated for two Hugo awards during the period under consideration, and the fact that the period included the holiday season, when genre fiction always sees an increase in sales. The piece is well worth a look for a more balanced analysis of what, if anything, can be learned from Hilton and Wiley’s article.

