For the group above, the mean effect on weekly sales is +5.6%. However, the standard deviation is 17.0%, so you can’t draw any statistically significant conclusion.

For non-fiction, the effect is +45% with SD 68%.

For fiction, the effect is +55% with a crazy SD of 168%. Taken at face value, that implies a good chance of an effect worse than -100% of sales, which is impossible, and it just goes to show that you should be wary about using simple stats to get true confidence intervals.
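To put a rough number on that "good chance", here is a minimal sketch in R. It assumes, purely for illustration, that the per-book effect is normally distributed with the fiction mean and SD quoted above; the variable names are made up for this example.

```r
# Assumption: per-book effect (in percent) ~ Normal(mean = 55, sd = 168),
# using the fiction figures quoted above.
effect_mean <- 55
effect_sd <- 168

# Probability that this model predicts a change below -100%,
# i.e. fewer than zero copies sold, which cannot actually happen.
p_impossible <- pnorm(-100, mean = effect_mean, sd = effect_sd)
print(round(p_impossible, 2))  # about 0.18
```

Under that assumption, nearly a fifth of the probability mass sits on outcomes that cannot occur, which suggests the normal model is a poor fit for percentage changes this skewed.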

The Tor data is obviously compromised because of the advertising. People go to the site to buy a print copy and get told, “Hey, why not get this as an ebook for free!”.

Using absolute values is arguably a bit sneaky. The effect is dominated by the most popular book. The main problem is that there is no control group, so we can’t see if the sales variance is noise or an effect (whether positive or negative).

However, regarding Cory’s actual point, the study is interesting. A good way to spin the conclusion is that in a study of several books, the effect of ebooks on print book sales was not statistically significant. The null hypothesis actually works in Cory’s favour.

I have bought many albums and gone to many shows because I found good new music via bittorrent, whereas I would not have checked those bands out had I been forced to pay to do so (poor grad student).

So yeah. Sheldrake once said, “The plural of anecdote is data.” People contest this (in terms of experiment design), but it’s clever.

The points about statistical significance are important. As we said in the conclusion of the article, “The results of the present study must be viewed with caution. Although the authors believe that free digital book distribution tends to increase print sales, this is not a universal law. The results we found cannot necessarily be generalized to other books, nor be construed to suggest causation. The timing of a free e-book’s release, the promotion it received and other factors cannot be fully accounted for. Nevertheless, we believe that this data indicates that when free e-books are offered for a relatively long period of time, without requiring registration, print sales will increase.”

That last sentence means that when the Tor data was excluded, 13 out of the 17 books saw increased sales. Whether or not there is statistical significance, there may be a practical significance.
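As a rough check on the 13-out-of-17 figure, one option is a simple sign test. The sketch below is only illustrative: it assumes each book is independent and, under the null hypothesis, equally likely to go up or down, and it ignores the size of each change and the lack of a control group.

```r
# Figures from the comment above: 13 of the 17 non-Tor books saw increased sales.
increased <- 13
books <- 17

# Assumption: under the null, each book independently has a 50% chance
# of an increase (ties ignored).
one_sided <- 1 - pbinom(increased - 1, size = books, prob = 0.5)

# Two-sided exact binomial test on the same counts.
two_sided <- binom.test(increased, books, p = 0.5)$p.value

print(round(c(one_sided = one_sided, two_sided = two_sided), 3))
# one_sided is about 0.025, two_sided about 0.049
```

On these assumptions the count of increases is borderline at the usual 5% level, but a sign test says nothing about how large the increases were or what caused them.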

One last point. The fact that book sales didn’t totally tank has important implications for books as open educational resources. I think there’s a huge benefit to society by making something available for free. Recently I’ve been involved with another study where eight books were downloaded 100,000 times over a ten-week period. Sales increased moderately, but the point wasn’t whether sales increased or decreased; here are 100,000 people who accessed works who otherwise probably wouldn’t have. If free e-books aren’t harming sales, it makes it easier to have a conversation about educational benefits.

Second, the researchers implicitly assume that book sales are uniformly distributed over time. The numbers are not related to how each book’s sales develop over time (i.e. the product life cycle), and there is no discussion of the effect of marketing activities, other events that bring attention to a book, etcetera. I have seen other numbers showing that impressions on the web generate sales in physical channels, and the launch (and probably the mere existence) of free ebooks is likely to have a similar effect (e.g. Magid Abraham on The Off-Line Impact of Online Ads in Harvard Business Review, April 2008, p. 28). Not discussing potential long-tail effects is reasonable, as the article focuses on short-term effects.

Third, they have numbers and don’t bother to calculate statistical significance? They use absolute rather than relative (percentage) change, and they don’t discuss the effect of relative size (sales volume) or outliers. In all four groups, one book is a major contributor to the group total (titles 5, 12, 14 and 40).

Overall, the study has several methodological limitations. The problem is that the researchers don’t discuss any of these or their potential effect on the findings. Hmmm, seems that we have to rely on anecdotes after all….

The good thing is that I have a fresh example for my students in a course on data analysis next week :)

I’ve downloaded half a dozen ebooks in the last few years (mostly through the Tor promotion). Most of those have resulted in me buying not just the book that I downloaded, but also pretty much everything else that author has written. I have a really hard time reading books by people who write poorly (I’m looking at you, Goodkind), so when I find an author I like I tend to collect their books.

I’m working my way through Sir Pratchett’s stuff right now.

As a corollary, publishing *books people like to read* free on-line will *increase* sales.

You know, just a wild guess. They say common sense ain’t so common any more, if it ever was.

$ R
> t.test(c(-64, 341, 65, 95, 21))

        One Sample t-test

data:  c(-64, 341, 65, 95, 21)
t = 1.3497, df = 4, p-value = 0.2484
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -96.82553 280.02553
sample estimates:
mean of x
     91.6

Even on the Tor sample, where they claim that there’s a negative correlation, and where there’s a decent amount of data, the p-value from the t-test is 0.2.

Nothing to see here, move along.
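For anyone wondering where the numbers in the t-test output above come from, here is a minimal sketch that rebuilds them by hand in base R. The five values are the ones passed to `t.test` above; the variable names are only for illustration.

```r
# The same five values used in the t.test call above.
x <- c(-64, 341, 65, 95, 21)
n <- length(x)

# Standard error of the mean, then the one-sample t statistic against 0.
se <- sd(x) / sqrt(n)
t_stat <- mean(x) / se

# Two-sided p-value from the t distribution with n - 1 degrees of freedom.
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# 95% confidence interval for the mean.
ci <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se

print(round(c(t = t_stat, p = p_value, lower = ci[1], upper = ci[2]), 4))
```

With only five values and a standard error of roughly 68, the interval comfortably includes zero, which is why the test cannot distinguish the observed mean of 91.6 from no effect at all.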

Right now, most good new books cost money, and people are acclimated to the idea of paying for them.

If most authors and publishers (instead of the current few) start giving away books for free in hopes of increasing sales, the previous statement will no longer be true.

The proliferation of free books will make consumers reluctant to pay for books, just as they are currently reluctant to pay for content online (where quality free material is in abundance).

So this study might be correct, but it is only correct in this moment. If all eBooks are given away free, the effect will dissipate, and publishers and authors will be undercutting their ability to sell ebooks, and be paid, in the future.

When certain newspapers started publishing free to the web fifteen years ago, no doubt their profiles were raised, and they gained readers. But that practice is now referred to as “the original sin of journalism” and responsible for cutting the legs out from underneath their industry. People are now unwilling to pay for news. They expect it should be free. If they encounter a paywall, they go elsewhere.

The same will happen to books if authors and publishers rush to release free ebooks in the hope it will increase their sales.

Furthermore, in the next decade we will trend away from paper books. When eBooks are 50% of the market, what will it mean to say free ebooks increase print sales?

I think the “free drives sales” phenomenon is real, but it is a temporary phenomenon enabled by shifts in the publishing industry, and I think if it is misinterpreted as anything else, it could lead to publishers and authors making the same mistakes journalism made fifteen years ago.

